MAMMALIAN SEX COMB ON MIDLEG (mammalian Scm)
ACTS AS A TUMOR SUPPRESSOR



Field of the Invention

The invention relates to a gene, mammalian sex comb on midleg (mammalian
Scm), implicated in proliferative disorders, including malignancies, and in
developmental processes.

Background of the Invention

Cancer and malignancy therapies have included treatment with chemical
toxins, radiation, and surgery. Genes known to be over-expressed or underexpressed
in cancer are used for diagnosis of the disease and evaluation of a patient's
progression with the disease and treatment.

The study of transcription has provided information about cell differentiation:
early in the development of a cell lineage, transcription factors direct development
along a particular pathway by activating genes of a differentiated phenotype.
Differentiation can involve not only changes in patterns of expressed genes, but also
the maintenance of those new patterns.

The genetic basis of mammalian development, and the genetic link between
development and cancer, has not been fully elucidated. There is a need in the art for
knowledge of the key genes underlying mammalian cancer, particularly those also
implicated in normal mammalian developmental processes.

Summary of the Invention

In one embodiment of the invention an isolated mammalian Scm polypeptide is
provided. The polypeptide comprises a sequence of at least 54 consecutive amino
acids of a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID
NO: 4, and SEQ ID NO: 6.

In another embodiment of the invention an isolated nucleic acid molecule is
provided. The nucleic acid molecule encodes a polypeptide having a sequence
selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, and SEQ ID
NO: 6.

According to yet another embodiment, an isolated nucleic acid molecule is
provided which comprises at least 30 contiguous nucleotides selected from the group
of sequences consisting of SEQ ID NO: 1, SEQ ID NO: 3, and SEQ ID NO: 5.

In another embodiment of the invention, an antibody preparation is provided.
The antibodies specifically bind to a mammalian Scm polypeptide and do not bind
specifically to other mammalian proteins.

In still another embodiment of the invention a method of treating a neoplasm is
provided. The method comprises:

contacting a neoplasm with an effective amount of a therapeutic agent
comprising a mammalian Scm polypeptide which comprises a sequence selected from
the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6, whereby
growth of the neoplasm is arrested.

In still another embodiment of the invention a method of inducing cell
differentiation is provided. The method comprises:

contacting a progenitor cell with a human Scm (hScm) polypeptide which
comprises a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID
NO: 4, and SEQ ID NO: 6, whereby differentiation of the cell is induced.

According to yet another embodiment of the invention a method of regulating
cell growth is provided. The method comprises:

contacting a cell whose growth is uncontrolled with a human Scm (hScm)
polypeptide which comprises a sequence selected from the group consisting of SEQ ID
NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6, whereby growth of the cell is regulated.

According to yet another aspect of the invention a pharmaceutical composition
is provided. The composition comprises an effective amount of a therapeutic agent
comprising a mammalian Scm polypeptide which comprises a sequence selected from
the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6, and a
pharmaceutically acceptable carrier.

Another aspect of the invention is a method of diagnosing neoplasia. The
method comprises:

contacting (a) a tissue sample suspected of neoplasia isolated from a patient
with (b) a mammalian Scm gene probe comprising at least 12 nucleotides of a
sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, and
SEQ ID NO: 5, wherein a tissue which underexpresses mammalian Scm or expresses a
variant mammalian Scm is categorized as neoplastic.

According to another embodiment of the invention a method of diagnosing
neoplasia is provided. The method comprises:

contacting PCR primers which specifically hybridize with a mammalian Scm
gene sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3,
and SEQ ID NO: 5, with nucleic acids isolated from a tissue suspected of neoplasia;

amplifying mammalian Scm sequences in the tissue; and

identifying a mutation when the amplified sequence differs from a sequence similarly
amplified from a normal human tissue.

In yet another embodiment of the invention a method of diagnosing neoplasia
is provided. The method comprises:

contacting a bDNA probe with nucleic acids isolated from a tissue suspected of
neoplasia, wherein the bDNA probe specifically hybridizes with a mammalian Scm
gene sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3,
and SEQ ID NO: 5;

detecting hybrids formed between the bDNA probe and nucleic acids isolated
from the tissue; and

identifying a mutation in the nucleic acids isolated from the tissue by
comparing the hybrids formed with hybrids similarly formed using nucleic acids from
a normal human tissue.

According to still another aspect of the invention a method of diagnosing
neoplasia is provided. The method comprises:

contacting a tissue sample suspected of being neoplastic with an antibody
selected from the group consisting of: one which specifically binds to wild-type
mammalian Scm as shown in SEQ ID NO: 2, 4, or 6, or one which specifically binds
to an expressed mammalian Scm variant;

detecting binding of the antibody to components of the tissue sample, wherein
a difference in the binding of the antibody to components of the tissue sample, as
compared to binding of the antibody to a normal human tissue sample, indicates
neoplasia of the tissue.

Another aspect of the invention is yet another method of diagnosing neoplasia.
The method comprises:

contacting RNA from a tissue suspected of being neoplastic with PCR primers
which specifically hybridize to a mammalian Scm gene sequence as shown in SEQ
ID NO: 1, 3, or 5, or a bDNA probe which specifically hybridizes to said sequence;

determining quantitative levels of mammalian Scm RNA in the tissue by PCR
amplification or bDNA probe detection, wherein lower levels of mammalian Scm
RNA as compared to a normal human tissue indicates neoplasia.

Also provided are nucleic acid molecules which can be used in regulating a
heterologous coding sequence coordinately with hScm. These sequences include the
5' untranslated region of an hScm gene, the 3' untranslated region of an hScm gene,
the promoter region of an hScm gene, and an intron of an hScm gene.

Also provided by the present invention is a method of identifying modulators
of hScm function comprising:

contacting a test substance with a human cell which comprises an hScm
gene or a reporter construct comprising an hScm promoter and a reporter gene;

quantitating transcription of hScm or the reporter gene in the presence
and absence of the test substance, wherein a test substance which increases
transcription is a candidate drug for anti-neoplastic therapy.

According to another embodiment a method of diagnosis of neoplasia is
provided. The method comprises:

contacting a tissue sample suspected of neoplasia isolated from a patient with
a mammalian Scm gene probe comprising at least 12 contiguous nucleotides of a
sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, and
SEQ ID NO: 5, wherein a tissue which overexpresses mammalian Scm or expresses a
variant mammalian Scm is categorized as neoplastic.

In still another aspect of the invention a method of dysregulating cell growth is
provided. The method comprises:

contacting a cell whose growth is controlled with a mammalian Scm
polypeptide which comprises a sequence selected from the group consisting of SEQ ID
NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6, whereby growth of the cell is
dysregulated.

According to another aspect of the invention a method of diagnosing
neoplasia is provided. The method comprises:

contacting RNA from a tissue suspected of being neoplastic with PCR primers
which specifically hybridize to a mammalian Scm gene sequence as shown in SEQ
ID NO: 1, 3, or 5, or a bDNA probe which specifically hybridizes to said sequence;

determining quantitative levels of mammalian Scm RNA in the tissue by PCR
amplification or bDNA probe detection, wherein higher levels of mammalian Scm
RNA as compared to a normal human tissue indicates neoplasia.

Also provided are nucleic acid molecules which can be used in regulating a
heterologous coding sequence coordinately with mammalian Scm. These sequences
include the 5' untranslated region of a mammalian Scm gene, the 3' untranslated
region of a mammalian Scm gene, the promoter region of a mammalian Scm gene,
and an intron of a mammalian Scm gene.

Also provided by the present invention is a method of identifying modulators
of mammalian Scm function comprising:

contacting a mammalian cell which comprises a mammalian Scm gene
or a reporter construct comprising a mammalian Scm promoter and a reporter gene
with a test substance;

quantitating transcription of mammalian Scm or the reporter gene in the
presence and absence of the test substance, wherein a test substance which decreases
transcription is a candidate drug for anti-neoplastic therapy.

The inventors have discovered a gene, the mammalian sex comb on midleg
(mammalian Scm), that operates to regulate protein expression in mammals,
particularly humans. Mammalian Scm may operate by controlling homeotic gene
expression. Although the invention is not limited by any theory or mechanism of how
the invention works, it is believed that control by this gene involves multiprotein
complexes capable of negative regulation of transcription.

The polypeptides of the invention include the splice variant polypeptides of
SEQ ID NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6. They contain different domains
of the mammalian Scm gene. The nucleic acid molecules (SEQ ID NO: 1, SEQ ID
NO: 3, and SEQ ID NO: 5) encoding the mammalian Scm polypeptides have been
cloned from human cells. The polynucleotide of SEQ ID NO: 1 encodes the
polypeptide of SEQ ID NO: 2, the polynucleotide of SEQ ID NO: 3 encodes the
polypeptide of SEQ ID NO: 4, and the polynucleotide of SEQ ID NO: 5 encodes the
polypeptide of SEQ ID NO: 6. Polypeptides comprising at least 6, 13, 20, 30, 40,
50, 54, 60, 65, or 75 amino acids of mammalian Scm are useful as immunogens for
raising antibodies and as competitors in immunoassays. They can also be used to
purify antibodies. Nucleic acid molecules of at least 15, 20, 30, 48, or 50 contiguous
nucleotides are useful as probes for use in diagnostic assays.
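
The minimum-length language above (fragments of at least N consecutive residues, probes of at least N contiguous nucleotides) can be illustrated with a short sketch. It is not part of the disclosure: the function names and the toy sequences are hypothetical, and the real SEQ ID NO sequences are not reproduced here.

```python
def contiguous_windows(sequence: str, length: int) -> list[str]:
    """Return every contiguous window of `length` residues in `sequence`."""
    if length <= 0:
        raise ValueError("length must be positive")
    # An empty list results when the sequence is shorter than the window.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def shares_contiguous_run(candidate: str, reference: str, min_length: int) -> bool:
    """True if `candidate` contains at least `min_length` contiguous residues
    that also occur contiguously in `reference` (exact matching only)."""
    reference_windows = set(contiguous_windows(reference, min_length))
    return any(w in reference_windows
               for w in contiguous_windows(candidate, min_length))


if __name__ == "__main__":
    # Toy nucleotide strings, assumed for illustration only.
    reference = "GGACGTGG"
    print(contiguous_windows("ACGTAC", 4))
    print(shares_contiguous_run("TTACGTTT", reference, 4))
```

Exact string matching is the simplest reading of "contiguous"; hybridization in practice tolerates mismatches, which this sketch does not model.
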
Both human and murine Scm, and their coding sequences, are provided herein.

There is a striking sequence conservation between murine and human Scm. They are
99% similar at the nucleotide level, and 97% identical at the amino acid level. The
proline at position 20 in hScm is substituted with a serine, and the tyrosine at position
59 in hScm is substituted with a phenylalanine. Other mammalian Scm proteins and
genes can be obtained by screening of cDNA libraries of a mammalian species with a
probe derived from the murine or human sequences. Such techniques are well known
in the art, and can be employed by those of skill in the art.
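
As a rough illustration of how a percent-identity figure and a list of position-specific substitutions like those above are computed, the following sketch compares two sequences that are assumed to be already aligned and of equal length. The function names and the five-residue toy peptides are hypothetical; they are not the hScm or murine Scm sequences, and gapped alignment is not handled.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def substitutions(reference: str, other: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference residue, other residue) for mismatches."""
    return [(pos, r, o)
            for pos, (r, o) in enumerate(zip(reference, other), start=1)
            if r != o]


if __name__ == "__main__":
    # Toy peptides (assumed): a Pro->Ser and a Tyr->Phe change, echoing the
    # kinds of substitution described in the text.
    ref, other = "MKPLY", "MKSLF"
    print(percent_identity(ref, other))
    print(substitutions(ref, other))
```

Positions are reported 1-based to match the residue numbering convention used in the text.
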

The domains of mammalian Scm protein which appear to be most conserved
are those found in the following locations in each of the isoforms of the human
proteins. In isoform 1 (amino acid SEQ ID NO: 4), the conserved domains are at aa
1 to 80, aa 93 to 128, aa 135 to 142, aa 144 to 166, and aa 527 to 565. In addition,
the following short segments [...]: aa 170 to 177, aa 261 to [...], and aa 460 to 467.
In isoform 2 (amino acid SEQ ID NO: 6), the conserved domains are: aa 20[...] to
287, aa 311 to 336, aa 345 to 373, aa 550 to 589, aa 625 to 710, aa 823 to 894,
aa 940 to 984, and aa 2170 to 2210. In addition, these shorter regions are indicated
as conserved: aa 446 to 451 and aa 506 to 511. In isoform 3 (amino acid SEQ ID
NO: 2), the domains which appear to be well conserved are: aa 36 to 85,
aa 6[...] to 120, aa 146 to 171, aa 186 to 208, and aa 570 to 608. [...]
modifications. In addition, these are most useful in [...] other species and
isoforms of Scm.
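
The residue ranges given above for isoform 1 use 1-based, inclusive numbering. A minimal sketch of pulling such regions out of a sequence string follows; only the isoform 1 coordinates are taken from the text, while the function name and the placeholder sequence are assumptions made for illustration (the actual SEQ ID NO: 4 sequence is not reproduced here).

```python
# Conserved-domain coordinates for isoform 1 as listed in the text
# (1-based, inclusive amino acid positions).
ISOFORM1_CONSERVED = [(1, 80), (93, 128), (135, 142), (144, 166), (527, 565)]


def extract_regions(sequence: str, regions: list[tuple[int, int]]) -> list[str]:
    """Return the subsequence for each 1-based inclusive (start, end) pair."""
    extracted = []
    for start, end in regions:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"region {start}-{end} lies outside the sequence")
        # Convert 1-based inclusive coordinates to a Python slice.
        extracted.append(sequence[start - 1:end])
    return extracted


if __name__ == "__main__":
    placeholder = "A" * 565  # hypothetical stand-in, not a real protein sequence
    print([len(r) for r in extract_regions(placeholder, ISOFORM1_CONSERVED)])
```

The explicit bounds check is a design choice: Python slicing silently truncates out-of-range indices, which would hide a coordinate that does not fit the sequence supplied.
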

The human Scm gene has been mapped to chromosome 1p34. This was
accomplished by [...] mapping. [...] activation, or repression [...] during
development. Thus mammalian Scm can be used therapeutically to [...] gene
expression [...] phenotype of a cell. Thus, for example, mammalian Scm can be used
to direct [...] progenitor cells. Similarly, [...] a differentiated cell to become
less differentiated [...] its pattern of gene expression.

Proliferative indications for which a mammalian Scm-based therapeutic agent
can be used include restenosis, benign prostatic hyperplasia, [...], psoriasis,
keloids, arthritis, [...] including, for example, intestinal polyps, cervical
dysplasia, and myeloid dysplasia. [...] include, but are not limited to, lung
carcinoma, colorectal adenocarcinoma, leukemia, Burkitt's lymphoma, and melanoma.

The coding region of mammalian Scm can be used for expression of
[...] development of mammalian [...] of disease or biological disorder where
overexpression of mammalian Scm occurs, such as, for example, in cancers such as
colorectal adenocarcinoma, lymphatic cancer, promyelocytic leukemia, Burkitt's
lymphoma, and myeloma. The 5' untranslated and 3' untranslated regions of
mammalian Scm can also be used diagnostically to the same effect as the mammalian
Scm coding sequence; for example, the 5' untranslated region can be isolated and
used to probe tissue, for example, lung tissue, where lung carcinoma is suspected.
Because mammalian Scm has been shown to be upregulated in lung carcinoma, probing
with any part of the mammalian Scm gene can identify the upregulation of mammalian
Scm in the tissue as an aid to making a diagnosis. Such diagnostic probes may also
be used for continued monitoring of a diagnosed [...] after and during treatment,
[...] of the disease.
Mammalian [...] genomic DNA with any probe length [...] ([...] # 11267, GMCC
,4737) has been deposited at the American Type Culture Collection, Rockville, MD.
The genomic DNA can be subcloned [...] for sequencing and [...] mammalian Scm is
useful [...] protocol, and for further analysis of mammalian Scm gene function and
regulatory control. Knowledge of promoter region sequences specific for binding
transcriptional activators that activate the mammalian Scm promoter can facilitate
improved expression of mammalian Scm for therapeutic purposes. The mammalian Scm
promoter region may be useful for tissue-specific expression of heterologous genes,
such as for treatment of lung carcinoma or colorectal adenocarcinoma. The region
immediately 5' of the coding region of mammalian Scm can be used, for example, as
a diagnostic probe for cancer or a developmental disorder associated with aberrant
mammalian Scm activity. The full length gene, or such non-coding regions of it as
the promoter and the 5' or 3' untranslated regions, can be isolated by probing genomic
DNA with a probe comprising at least about 12 nucleotides of mammalian Scm
cDNA, and retrieving a genomic sequence that hybridizes to one of these sequences.
The 5' untranslated end and the promoter regions, for example, can be cloned by PCR
cloning with random oligonucleotide and a 5' portion of the known coding sequence.

The polypeptides of the invention can further be used to generate monoclonal
or polyclonal antibodies. Monoclonal antibodies are prepared using the method of
Kohler and Milstein, as described in Nature (1975) 256: 495-96, or a modification
thereof. Antibodies to mammalian Scm, either polyclonal or monoclonal, can be used
topically. They are desirably compatible with the host to be treated. For
example, for treatment of humans, the antibodies can be human monoclonal antibodies
or humanized antibodies, as the term is generally known in the art. Alternatively,
single chain antibodies may be used for therapy. Antibodies may act to antagonize
or inhibit the polypeptide activity of mammalian Scm, and are also useful in
diagnosing a condition characterized by mammalian Scm expression or
over-expression, such as, for example, a malignant condition. Similarly,
underexpression can be detected. Such antibodies bind specifically to mammalian
Scm but not to other human proteins. More preferred is the situation where the
antibodies are human species mammalian Scm-specific.

Expression of mammalian Scm can be accomplished by any expression system
appropriate for the purpose and conditions presented. Some exemplary expression
systems are listed below. Where mammalian Scm itself is used as a therapeutic, the
polypeptide can be expressed and subsequently administered to a patient.
Alternatively, a gene encoding at least a functional portion of mammalian Scm can be
administered to a patient for expression in the patient.

Recombinant mammalian Scm may be used as a reagent for diagnostic methods
for diagnosis of cancer or a developmental disorder. It may also be used as a
therapeutic for inducing differentiation in a population of progenitor cells.
Recombinant mammalian Scm can also be used to develop modulators of mammalian
Scm for achieving a desired therapeutic effect. Construction and expression of any of
the recombinant molecules of the invention can be accomplished by any expression
system most appropriate for the task, including, for example, an expression system
described below.

Although the methodology described below is believed to contain sufficient
details to enable one skilled in the art to practice the present invention, other
constructs can be constructed and purified using standard recombinant DNA
techniques as described in, for example, Sambrook et al. (1989), MOLECULAR
CLONING: A LABORATORY MANUAL, 2nd ed. (Cold Spring Harbor Press, Cold Spring
Harbor, New York); and under current regulations described in United States Dept. of
Health and Human Services, National Institutes of Health (NIH) Guidelines for
Recombinant DNA Research. The polypeptides of the invention can be expressed in
any expression system, including, for example, bacterial, yeast, insect, amphibian, and
mammalian systems. Expression systems in bacteria include those described in Chang
et al., Nature (1978) 275: 615, Goeddel et al., Nature (1979) 281: 544, Goeddel et
al., Nucleic Acids Res. (1980) 8: 4057, EP 36,776, U.S. 4,551,433, deBoer et al.,
Proc. Natl. Acad. Sci. USA (1983) 80: 21-25, and Siebenlist et al., Cell (1980) 20:
269. Expression systems in yeast include those described in Hinnen et al., Proc.
Natl. Acad. Sci. USA (1978) 75: 1929; Ito et al., J. Bacteriol. (1983) 153: 163; Kurtz
et al., Mol. Cell. Biol. (1986) 6: 142; Kunze et al., J. Basic Microbiol. (1985) 25:
141; Gleeson et al., J. Gen. Microbiol. (1986) 132: 3459; Roggenkamp et al., Mol.
Gen. Genet. (1986) 202: 302; Das et al., J. Bacteriol. (1984) 158: 1165; De
Louvencourt et al., J. Bacteriol. (1983) 154: 737; Van den Berg et al.,
Bio/Technology (1990) 8: 135; Kunze et al., J. Basic Microbiol. (1985) 25: 141;
Cregg et al., Mol. Cell. Biol. (1985) 5: 3376, U.S. 4,837,148, US 4,929,555; Beach
and Nurse, Nature (1981) 300: 706; Davidow et al., Curr. Genet. (1985) 10: 380,
Gaillardin et al., Curr. Genet. (1985) 10: 49, Ballance et al., Biochem. Biophys. Res.
Commun. (1983) 112: 284-289; Tilburn et al., Gene (1983) 26: 205-221, Yelton et
al., Proc. Natl. Acad. Sci. USA (1984) 81: 1470-1474, Kelly and Hynes, EMBO J.
(1985) 4: 475-479; EP 244,234, and WO 91/00357. Expression of heterologous genes
in insects can be accomplished as described in U.S. 4,745,051, Friesen et al. (1986)
"The Regulation of Baculovirus Gene Expression" in: THE MOLECULAR BIOLOGY OF
BACULOVIRUSES (W. Doerfler, ed.), EP 127,839, EP 155,476, and Vlak et al., J.
Gen. Virol. (1988) 69: 765-776, Miller et al., Ann. Rev. Microbiol. (1988) 42: 177,
Carbonell et al., Gene (1988) 73: 409, Maeda et al., Nature (1985) 315: 592-594,
Lebacq-Verheyden et al., Mol. Cell. Biol. (1988) 8: 3129; Smith et al., Proc. Natl.
Acad. Sci. USA (1985) 82: 8404, Miyajima et al., Gene (1987) 58: 273; and Martin
et al., DNA (1988) 7: 99. Numerous baculoviral strains and variants and
corresponding permissive insect host cells from hosts are described in Luckow et al.,
Bio/Technology (1988) 6: 47-55, Miller et al. in GENETIC ENGINEERING (Setlow, J.K.
et al. eds.), Vol. 8 (Plenum Publishing, 1986), pp. 277-279, and Maeda et al.,
Nature (1985) 315: 592-594. Mammalian expression can be accomplished as
described in Dijkema et al., EMBO J. (1985) 4: 761, Gorman et al., Proc. Natl.
Acad. Sci. USA (1982b) 79: 6777, Boshart et al., Cell (1985) 41: 521 and U.S.
4,399,216. Other features of mammalian expression can be facilitated as described in
Ham and Wallace, Meth. Enz. (1979) 58: 44, Barnes and Sato, Anal. Biochem. (1980)
102: 235, U.S. 4,767,704, US 4,657,866, US 4,927,762, US 4,560,655, WO
90/103430, WO 87/00195, and U.S. RE 30,985.

Constructs including a mammalian Scm coding sequence or constructs
including coding sequences for modulators of mammalian Scm can be administered by
a gene therapy protocol, either locally or systemically. These constructs can utilize
viral or non-viral vectors and can be delivered in vivo or ex vivo or in vitro.
Expression of such coding sequence can be driven by endogenous mammalian or
heterologous promoters. Expression of the coding sequence in vivo can be either
constitutive or regulated.

Gene delivery vehicles (GDVs) are available for delivery of polynucleotides to
cells, tissue, or to a mammal for expression. For example, a polynucleotide sequence
of the invention can be administered either locally or systemically in a GDV. These
constructs can utilize viral or non-viral vector approaches in in vivo or ex vivo modality.
Expression of such coding sequence can be induced using endogenous mammalian or
heterologous promoters. Expression of the coding sequence in vivo can be either
constitutive or regulated. The invention includes gene delivery vehicles capable of
expressing the contemplated polynucleotides. The gene delivery vehicle is preferably a
viral vector and, more preferably, a retroviral, adenoviral, adeno-associated viral (AAV),
herpes viral, or alphavirus vector. The viral vector can also be an astrovirus,
coronavirus, orthomyxovirus, papovavirus, paramyxovirus, parvovirus, picornavirus,
poxvirus, or togavirus viral vector. See generally, Jolly, Cancer Gene Therapy 1:51-64
(1994); Kimura, Human Gene Therapy 5:845-852 (1994), Connelly, Human Gene
Therapy 6:185-193 (1995), and Kaplitt, Nature Genetics 6:148-153 (1994).

Retroviral vectors are well known in the art and we contemplate that any retroviral
gene therapy vector is employable in the invention, including B, C and D type
retroviruses, xenotropic retroviruses (for example, NZB-X1, NZB-X2 and NZB9-1 (see
O'Neill, J. Vir. 53:160, 1985)), polytropic retroviruses (for example, MCF and MCF-MLV
(see Kelly, J. Vir. 45:291, 1983)), spumaviruses and lentiviruses. See RNA Tumor
Viruses, Second Edition, Cold Spring Harbor Laboratory, 1985.

Portions of the retroviral gene therapy vector may be derived from different
retroviruses. For example, retroviral LTRs may be derived from a Murine Sarcoma
Virus, a tRNA binding site from a Rous Sarcoma Virus, a packaging signal from a
Murine Leukemia Virus, and an origin of second strand synthesis from an Avian
Leukosis Virus. These recombinant retroviral vectors may be used to generate
transduction competent retroviral vector particles by introducing them into appropriate
packaging cell lines (see [...] 29, 1991).
Retrovirus vectors can be constructed for site-specific integration into host cell DNA by
incorporation of a chimeric integrase enzyme into the retroviral particle. See U.S.
Serial No. 08/445,466 filed May 22, 1995. It is preferable that the recombinant viral
vector is a replication defective recombinant virus. Packaging cell lines suitable for use
with the above-described retrovirus vectors are well known in the art, are readily
prepared (see U.S. Serial No. 08/240,030, filed May 9, 1994; see also WO 92/05266),
and can be used to create producer cell lines (also termed vector cell lines or "VCLs")
for the production of recombinant vector particles. Preferably, the packaging cell lines
are made from human parent cells (e.g., HT1080 cells) or mink parent cell lines, which
eliminates inactivation in human serum. Preferred retroviruses for the construction of
retroviral gene therapy vectors include Avian Leukosis Virus, Bovine Leukemia Virus,
Murine Leukemia Virus, Mink-Cell Focus-Inducing Virus, Murine Sarcoma Virus,
Reticuloendotheliosis Virus and Rous Sarcoma Virus. Particularly preferred Murine
Leukemia Viruses include 4070A and 1504A (Hartley and Rowe, J. Virol. 19:19-25,
1976), Abelson (ATCC No. VR-999), Friend (ATCC No. VR-245), Graffi, Gross
(ATCC No. VR-590), Kirsten, Harvey Sarcoma Virus and Rauscher (ATCC No.
VR-998) and Moloney Murine Leukemia Virus (ATCC No. VR-190). Such retroviruses
may be obtained from depositories or collections such as the American Type Culture
Collection ("ATCC") in Rockville, Maryland or isolated from known sources using
commonly available techniques. Exemplary known retroviral gene therapy vectors
employable in this invention include those described in GB 2200651; EP No. 415,731;
EP No. [...]; PCT Publication Nos. WO 89/02468, WO 89/05349, WO [...],
WO 90/07936, WO 94/03622, WO 93/25698, WO 93/25234, [...], WO 93/10218, and
WO 91/02805; in U.S. Patent Nos. 5,219,740, [...], 4,980,289 and [...]; in U.S.
Serial No. [...]; and in Vile, Cancer Res. 53:3860-3864 (1993); Vile, Cancer Res.
53:962-967 (1993); Ram, Cancer Res. 53:83-88 (1993); Takamiya, J. Neurosci. Res.
33:493-503 (1992); [...] (1993); Mann, Cell 33:153 (1983); Cane, Proc. Natl. Acad.
Sci. [...]; and Miller, Human Gene Therapy 1 (1990).

Human adenoviral gene therapy vectors are also known in the art and employable in
this invention. See, for example, Berkner, Biotechniques 6:616 (1988), and Rosenfeld,
Science 252:431 (1991), and PCT Patent Publication Nos. WO 93/07283, WO
93/06223, and WO 93/07282. Exemplary known adenoviral gene therapy vectors
employable in this invention include those described in the above-referenced documents
and in PCT Patent Publication Nos. WO 94/12649, WO 93/03769, WO 93/19191, WO
94/28938, WO 95/11984, WO 95/00655, WO 95/27071, WO 95/29993, WO 95/34671,
WO 96/05320, WO 94/08026, WO 94/11506, WO 93/06223, WO 94/24299, WO
95/14102, WO 95/24297, WO 95/02697, WO 94/28152, WO 94/24299, WO 95/09241,
WO 95/25807, WO 95/05835, WO 94/18922 and WO 95/09654. Alternatively,
administration of DNA linked to killed adenovirus as described in Curiel, Hum. Gene
Ther. 3:147-154 (1992) may be employed.

The gene delivery vehicles of the invention
also include adenovirus associated virus (AAV) vectors. Leading and preferred examples
of such vectors for use in this invention are the AAV-2 based vectors disclosed in
Srivastava, PCT Patent Publication No. WO 93/09239. Most preferred AAV vectors
comprise the two AAV inverted terminal repeats in which the native D-sequences are
modified by substitution of nucleotides, such that at least 5 native nucleotides and up to
18 native nucleotides, preferably at least 10 native nucleotides up to 18 native
nucleotides, most preferably 10 native nucleotides are retained and the remaining
nucleotides of the D-sequence are deleted or replaced with non-native nucleotides. The
native D-sequences of the AAV inverted terminal repeats are sequences of 20 consecutive
nucleotides in each AAV inverted terminal repeat (i.e., there is one sequence at each end)
which are not involved in HP formation. The non-native replacement nucleotide may be
any nucleotide other than the nucleotide found in the native D-sequence in the same
position. Other employable exemplary AAV vectors are pWP[...], both of which
are disclosed in Nahreini, Gene 124:257-262 (1993). Another example of such an AAV
vector is psub201. See Samulski, J. Virol. 61:3096 (1983). Another exemplary AAV
vector is the Double-D ITR vector. How to make the Double D ITR vector is disclosed
in U.S. Patent No. 5,478,745. Still other vectors are those disclosed in Carter, U.S.
Patent No. 4,797,36[...] and [...]. Yet a further
example of an AAV vector employable in this invention is SSV9AFABTKneo, which
contains the [...] promoter and directs expression predominantly
in the liver. Its structure and how to make it are disclosed in Su, Human Gene Therapy
7:463-470 (1996). Additional AAV gene therapy vectors are described in U.S. Patent
Nos. 5,354,678; 5,173,414; 5,139,941; and 5,252,47[...].

The gene therapy vectors of
the invention also include herpes vectors. Leading and preferred examples are herpes
simplex virus vectors containing a sequence encoding a thymidine kinase polypeptide
such as those disclosed in U.S. Patent No. 5,288,641 and EP No. 176,170 (Roizman).
Additional exemplary herpes simplex virus vectors include HFEM/ICP6-LacZ disclosed
in PCT Patent No. WO 95/04139 (Wistar Institute), pHSVlac described in Geller,
Science 241:1667-1669 (1988) and in PCT Patent Publication Nos. WO 90/09441 and
WO 92/07945, HSV Us3::pgC-lacZ described in Fink, Human Gene Therapy 3:11-19
(1992) and HSV 7134, 2 RH 105 and GAL4 described in EP No. 453,242 (Breakefield),
and those deposited with the ATCC as accession numbers ATCC VR-977 and ATCC
VR-260.

Alpha virus gene therapy vectors may be employed in this invention.
Preferred alpha virus vectors are Sindbis viruses vectors. Togaviruses, Semliki Forest
virus (ATCC VR-67; ATCC VR-1247), Middleberg virus (ATCC VR-370), Ross River
virus (ATCC VR-373; ATCC VR-1246), Venezuelan equine encephalitis virus (ATCC
VR923; ATCC VR-1250; ATCC VR-1249; ATCC VR-532), and those described in U.S.
Patent Nos. 5,091,309 and 5,217,879, and PCT Patent Publication No. WO 92/10578.
More particularly, those alpha virus vectors described in U.S. Serial No. 08/405,627,
filed March 15, 1995, and U.S. Serial No. 08/198,450 and in PCT Patent Publication
Nos. WO 94/21792, WO 92/10578, and WO 95/07994, and U.S. Patent Nos. 5,091,309
and 5,217,879 are employable. Such alpha viruses may be obtained from depositories
or collections such as the ATCC in Rockville, Maryland or isolated from known sources
using commonly available techniques. Preferably, alphavirus vectors with reduced
cytotoxicity are used (see co-owned U.S. Serial No. 08/679640). DNA vector systems
such as eukaryotic layered expression systems are also useful for expressing the nucleic
acids of the invention. See PCT Patent Publication No. WO 95/07994 for a detailed
description of eukaryotic layered expression systems. Preferably, the eukaryotic layered
expression systems of the invention are derived from alphavirus vectors and most
preferably from Sindbis viral vectors.

Other viral vectors suitable for use in the present
invention include those derived from poliovirus, for example ATCC VR-58 and those
described in Evans, Nature 335:385 (1989), and Sabin, J. Biol. Standardization 1:115
(1973); rhinovirus, for example ATCC VR-1110 and those described in Arnold, J Cell
Biochem (1990) L401; pox viruses such as canary pox virus or vaccinia virus, for
example ATCC VR-111 and ATCC VR-2010 and those described in Fisher-Hoch, Proc
Natl Acad Sci 86 (1989) 317, Flexner, Ann NY Acad Sci 569:86 (1989), Flexner,
Vaccine 8:17 (1990); in U.S. Patent Nos. 4,603,112 and 4,769,330 and in WO
89/01973; SV40 virus, for example ATCC VR-305 and those described in Mulligan,
Nature 277:108 (1979) and Madzak, J Gen Vir 73:1533 (1992); influenza virus, for
example ATCC VR-797 and recombinant influenza viruses made employing reverse
genetics techniques as described in U.S. Patent No. 5,166,057 and in Enami, Proc.
Natl. Acad. Sci. 87:3802-3805 (1990); Enami and Palese, J. Virol. 65:2711-2713
(1991); and Luytjes, Cell 59:110 (1989), (see also McMichael, New England J. Med.
309:13 (1983), and Yap, Nature 273:238 (1978) and Nature 277:108, 1979); human
immunodeficiency virus as described in EP No. 386,882 and in Buchschacher, J. Vir.






WO 97/42211 



PCT/US97/07575 



66:2731 (1992); measles virus, for example, ATCC VR-67 and VR-1247 and those
described in EP No. 440,219; Aura virus, for example, ATCC VR-368; Bebaru virus,
for example, ATCC VR-600 and ATCC VR-1240; Cabassou virus, for example, ATCC
VR-922; Chikungunya virus, for example, ATCC VR-64 and ATCC VR-1241; Fort
Morgan virus, for example, ATCC VR-924; Getah virus, for example, ATCC VR-369
and ATCC VR-1243; Kyzylagach virus, for example, ATCC VR-927; Mayaro virus,
for example, ATCC VR-66; Mucambo virus, for example, ATCC VR-580 and ATCC
VR-1244; Ndumu virus, for example, ATCC VR-371; Pixuna virus, for example,
ATCC VR-372 and ATCC VR-1245; Tonate virus, for example, ATCC VR-925;
Triniti virus, for example, ATCC VR-469; Una virus, for example, ATCC VR-374;
Whataroa virus, for example, ATCC VR-926; Y-62-33 virus, for example, ATCC
VR-375; O'Nyong virus, Eastern encephalitis virus, for example, ATCC VR-65 and
ATCC VR-1242; Western encephalitis virus, for example, ATCC VR-70, ATCC
VR-1251, ATCC VR-622 and ATCC VR-1252; and coronavirus, for example, ATCC
VR-740 and those described in Hamre, Proc. Soc. Exp. Biol. Med. 121:190 (1966).
Delivery of the compositions of this invention into cells is not limited to the above
mentioned viral vectors. Other delivery methods and media may be employed such as,
for example, nucleic acid expression vectors, polycationic condensed DNA linked or
unlinked to killed adenovirus alone, for example see U.S. Serial No. 08/366,787, filed
December 30, 1994, and Curiel, Hum Gene Ther 3:147-154 (1992), ligand linked DNA,
for example, see Wu, J. Biol. Chem. 264:16985-16987 (1989), eukaryotic cell delivery
vehicles cells, for example see U.S. Serial No. 08/240,030, filed May 9, 1994, and
U.S. Serial No. 08/404,796, deposition of photopolymerized hydrogel materials,
hand-held gene transfer particle gun, as described in U.S. Patent No. 5,149,655,
ionizing radiation as described in U.S. Patent No. 5,206,152 and in PCT Patent
Publication No. WO 92/11033, nucleic charge neutralization or fusion with cell
membranes. Additional approaches are described in Philip, Mol. Cell. Biol.
14:2411-2418 (1994) and in Woffendin, Proc. Natl. Acad. Sci. 91:1581-585 (1994).
Particle mediated gene transfer may be employed, for example see U.S. provisional
application No. 60/023,867. Briefly, the sequence can be inserted into conventional
vectors that contain conventional control sequences for high level expression, and then
be incubated with synthetic gene transfer molecules such as polymeric DNA-binding
cations like polylysine, protamine, and albumin, linked to cell targeting ligands such as
asialoorosomucoid, as described in Wu and Wu, J. Biol. Chem. 262:4429-4432 (1987),
insulin as described in Hucked, Biochem. Pharmacol. 40:253-263 (1990), galactose as
described in Plank, Bioconjugate Chem 3:533-539 (1992), lactose or transferrin. Naked
DNA may also be employed. Exemplary naked DNA introduction methods are
described in PCT Patent Publication No. WO 90/11092 and U.S. Patent No. 5,580,859.
Uptake efficiency may be improved using biodegradable latex beads. DNA coated latex
beads are efficiently transported into cells after endocytosis initiation by the beads. The
method may be improved further by treatment of the beads to increase hydrophobicity
and thereby facilitate disruption of the endosome and release of the DNA into the
cytoplasm. Liposomes that can act as gene delivery vehicles are described in U.S.
Patent No. 5,422,120, PCT Patent Publication Nos. WO 95/13796, WO 94/23697 and
WO 91/14445, and EP No. 524,968. As described in co-owned U.S. provisional
application No. 60/023,867, on non-viral delivery, the nucleic acid sequences can be
inserted into conventional vectors that contain conventional control sequences for high
level expression, and then be incubated with synthetic gene transfer molecules such as
polymeric DNA-binding cations like polylysine, protamine, and albumin, linked to cell
targeting ligands such as asialoorosomucoid, insulin, galactose, lactose, or transferrin.



Other delivery systems include the use of liposomes to encapsulate DNA comprising the
gene under the control of a variety of tissue-specific or ubiquitously-active promoters.
Further non-viral delivery suitable for use includes mechanical delivery systems such as
the approach described in Woffendin et al., Proc. Natl. Acad. Sci. USA
91(24):11581-11585 (1994). Moreover, the coding sequence and the product of
expression of such can be delivered through deposition of photopolymerized hydrogel
materials. Other conventional methods for gene delivery that can be used for delivery
of the coding sequence include, for example, use of hand-held gene transfer particle gun
as described in U.S. Patent No. 5,149,655; use of ionizing radiation for activating
transferred gene, as described in U.S. Patent No. 5,206,152 and PCT Patent Publication
No. WO 92/11033. Exemplary liposome and polycationic gene delivery vehicles are






those described in U.S. Patent Nos. 5,422,120 and 4,762,915, in PCT Patent
Publication Nos. WO 95/13796, WO 94/23697, and WO 91/14445, in EP No. 524,968
and in Stryer, Biochemistry, pages 236-240 (1975) W.H. Freeman, San Francisco;
Szoka, Biochem. Biophys. Acta. 600:1 (1980); Bayer, Biochem. Biophys. Acta. 550:464
(1979); Rivnay, Meth. Enzymol. 149:119 (1987); Wang, Proc. Natl. Acad. Sci. 84:7851
(1987); and Plant, Anal. Biochem. 176:420 (1989).

Test compounds can be tested as candidate modulators by testing the ability to
increase or decrease the expression of mammalian Scm. The candidate modulators can
be derived from any of the various possible sources of candidates, such as, for example,
libraries of peptides, peptoids, small molecules, polypeptides, antibodies,
polynucleotides, antisense molecules, ribozymes, cRNA, cDNA, and
polypeptides presented by phage display. Described below are some exemplary and
possible sources of candidates, including synthetic libraries of peptides, peptoids, and
small molecules. The exemplary expression systems can be used to generate cRNA or
cDNA libraries that can also be screened for the ability to modulate mammalian Scm
activity or expression. Candidate molecules screened for the ability to agonize
mammalian Scm expression or activity may be useful for inducing differentiation in a
population of progenitor cells. Candidate molecules can be screened for the ability to either
affect mammalian Scm expression, or affect mammalian Scm function, by enhancing or
interfering in mammalian Scm's ability to interact with other molecules that mammalian
Scm normally interacts with in mammalian Scm's normal function.

Mammalian Scm peptide modulators are screened using any available method.
The assay conditions ideally should resemble the conditions under which the
mammalian Scm modulation is exhibited in vivo, that is, under physiologic pH,
temperature, ionic strength, etc. Suitable antagonists will exhibit strong inhibition of
mammalian Scm expression or activity at concentrations that do not cause toxic side
effects in the subject. A further alternative agent that can be used herein as a
modulator of mammalian Scm is a small molecule antagonist. Small molecules can be
designed and screened from a pool of synthetic candidates for ability to modulate
mammalian Scm. There exist a wide variety of small molecules, including peptide
analogs and derivatives, that can act as inhibitors of proteins and polypeptides.
Libraries of these molecules can be screened for those compounds that inhibit the
activity or expression of mammalian Scm. Similarly, ribozymes can be screened in
assays appropriate for ribozymes, taking into account the special biological or
biochemical nature of ribozymes. Assays for affecting mammalian Scm expression
can measure mammalian Scm message or protein directly, or can measure expression
of a reporter gene which is under the control of a mammalian Scm promoter and/or 5'
untranslated region (UTR).
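
By way of illustration only, a reporter-based readout of the kind mentioned above could be scored as a fold change of reporter signal in treated cells relative to a vehicle control. The following is a minimal sketch, not a described assay: the class name, the function names, and the two-fold cutoff are assumptions introduced here and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class ReporterReading:
    """One hypothetical measurement from a reporter assay."""
    candidate: str   # label of the test compound
    treated: float   # reporter signal with the candidate present
    control: float   # reporter signal with vehicle only


def fold_change(reading: ReporterReading) -> float:
    """Return treated signal divided by control signal."""
    if reading.control <= 0:
        raise ValueError("control signal must be positive")
    return reading.treated / reading.control


def classify(reading: ReporterReading, threshold: float = 2.0) -> str:
    """Label a candidate by its effect on reporter expression.

    The two-fold threshold is an arbitrary illustrative cutoff.
    """
    fc = fold_change(reading)
    if fc >= threshold:
        return "increases expression"
    if fc <= 1.0 / threshold:
        return "decreases expression"
    return "no clear effect"


if __name__ == "__main__":
    for r in (
        ReporterReading("compound A", treated=300.0, control=100.0),
        ReporterReading("compound B", treated=40.0, control=100.0),
        ReporterReading("compound C", treated=120.0, control=100.0),
    ):
        print(r.candidate, "->", classify(r))
```

In practice the cutoff, replicate handling, and normalization would be set by the assay actually used.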






Mammalian Scm or a modulator of mammalian Scm can be administered to a
patient exhibiting a condition characterized by abnormal cell proliferation, in which
aberrant mammalian Scm gene expression is implicated, particularly excessive
mammalian Scm activity, or excessive activity controlled or induced by mammalian
Scm activity. The modulator can be incorporated into a pharmaceutical composition
that includes a pharmaceutically acceptable carrier for the modulator. Suitable
carriers may be large, slowly metabolized macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino
acid copolymers, and inactive virus particles. Such carriers are well known to those
of ordinary skill in the art. Pharmaceutically acceptable salts can be used therein, for
example, mineral acid salts such as hydrochlorides, hydrobromides, phosphates,
sulfates, and the like; and the salts of organic acids such as acetates, propionates,
malonates, benzoates, and the like. A thorough discussion of pharmaceutically
acceptable excipients is available in REMINGTON'S PHARMACEUTICAL
SCIENCES (Mack Pub. Co., N.J. 1991). Pharmaceutically acceptable carriers in
therapeutic compositions may contain liquids such as water, saline, glycerol and
ethanol. Additionally, auxiliary substances, such as wetting or emulsifying agents,
pH buffering substances, and the like, may be present in such vehicles. Typically, the
therapeutic compositions are prepared as injectables, either as liquid solutions or
suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles
prior to injection may also be prepared.

Liposomes are included within the definition of a pharmaceutically acceptable
carrier. The term "liposomes" refers to, for example, the liposome compositions
described in U.S. Patent No. 5,422,120, WO 95/13796, WO 94/23697, WO
91/14445 and EP 524,968 B1. Liposomes may be pharmaceutical carriers for the
peptides, polypeptides or polynucleotides of the invention, or for combination of these
therapeutics.

Any therapeutic of the invention, including, for example, polynucleotides for
expression in the patient, or ribozymes or antisense oligonucleotides, can be formulated
into an enteric coated tablet or gel capsule according to known methods in the art.
These are described in the following patents: US 4,853,230, EP 225,189, AU
9,224,296, AU 9,230,801, and WO 92144,52. Such a capsule is administered orally
to be targeted to the jejunum. At 1 to 4 days following oral administration, expression
of the polypeptide, or inhibition of expression by, for example, a ribozyme or an
antisense oligonucleotide, is measured in the plasma and blood, for example by
antibodies to the expressed or non-expressed proteins.

Administration of a therapeutic agent of the invention, including, for example,
a mammalian Scm modulator, includes administering a therapeutically effective dose
of the therapeutic agent by a means considered or empirically deduced to be effective
for inducing the desired effect in the patient. Both the dose and the administration
means can be determined based on the specific qualities of the therapeutic, the
condition of the patient, the progression of the disease, and other relevant factors.
Administration of the therapeutic agents of the invention can include local or
systemic administration, including injection, oral administration, particle gun or
catheterized administration, and topical administration. The therapeutics of the
invention can be administered in a therapeutically effective dosage and amount, in the
process of a therapeutically effective protocol for treatment of the patient. The initial
and any subsequent dosages administered will depend upon the patient's age, weight,
condition, and the disease, disorder or biological condition being treated. Depending
on the therapeutic, the dosage and protocol for administration will vary, and the
dosage will also depend on the method of administration selected, for example, local
or systemic administration.


For polypeptide therapeutics, for example, a dominant negative mammalian
Scm polypeptide or a polypeptide modulator of mammalian Scm, the dosage can be in
the range of about 5 μg to about 50 μg/kg of patient body weight, also about 50 μg to
about 5 mg/kg, also about 100 μg to about 500 μg/kg of patient body weight, and
about 200 to about 250 μg/kg.
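
As a worked illustration of how a per-kilogram range translates into a total amount, the sketch below multiplies a range given in μg/kg by a body weight. The function name and the 70 kg example weight are assumptions chosen for illustration, not values from the specification; the 200 to 250 μg/kg range is the narrowest range recited above.

```python
def dose_range_ug(weight_kg: float, low_ug_per_kg: float, high_ug_per_kg: float):
    """Return (low, high) total dose in micrograms for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if low_ug_per_kg > high_ug_per_kg:
        raise ValueError("low end of the range exceeds the high end")
    return (weight_kg * low_ug_per_kg, weight_kg * high_ug_per_kg)


if __name__ == "__main__":
    # Hypothetical 70 kg patient, range of about 200 to about 250 ug/kg.
    low, high = dose_range_ug(70.0, 200.0, 250.0)
    print(f"{low / 1000:.1f} mg to {high / 1000:.1f} mg")  # 14.0 mg to 17.5 mg
```

This is arithmetic only; an actual dose would be set as described in the surrounding text.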

For polynucleotide therapeutics, depending on the expression of the
polynucleotide in the patient, for tissue targeted administration, vectors containing
expressible constructs including mammalian Scm coding sequences or modulator
coding sequences, or non-coding sequences can be administered in a range of about
100 ng to about 200 mg of DNA for local administration in a gene therapy protocol,
also about 500 ng to about 50 mg, also about 1 μg to about 2 mg of DNA, about 5 μg
of DNA to about 500 μg of DNA, and about 20 μg to about 100 μg during a local
administration in a gene therapy protocol, and for example, a dosage of about 500 μg,
per injection or administration.

Non-coding sequences that act by a catalytic mechanism, for example,
catalytically active ribozymes, may require lower doses than non-coding sequences that
are held to the restrictions of stoichiometry, as in the case of, for example, antisense
molecules, although expression limitations of the ribozymes may again raise the
dosage requirements of ribozymes being expressed in vivo in order that they achieve
efficacy in the patient. Factors such as method of action and efficacy of
transformation and expression are therefore considerations that will affect the dosage
required for ultimate efficacy for DNA and nucleic acids. Where greater expression
is desired, over a larger area of tissue, larger amounts of DNA or the same amounts
readministered in a successive protocol of administrations, or several administrations
to different adjacent or close tissue portions of, for example, a tumor site, may be
required to effect a positive therapeutic outcome.

For administration of small molecule modulators of mammalian Scm
polypeptide activity, depending on the potency of the small molecule, the dosage may
vary. For a very potent inhibitor, microgram (μg) amounts per kilogram of patient
may be sufficient, for example, in the range of about 1 μg/kg to about 500 mg/kg of
patient weight, and about 100 μg/kg to about 5 mg/kg, and about 1 μg/kg to about 50
μg/kg, and, for example, about 10 μg/kg. For administration of peptides and peptoids
the potency also affects the dosage, and may be in the range of about 1 μg/kg to about
500 mg/kg of patient weight, and about 100 μg/kg to about 5 mg/kg, and about 1
μg/kg to about 50 μg/kg, and a usual dose might be about 10 μg/kg.

In all cases, routine experimentation in clinical trials will determine specific
ranges for optimal therapeutic effect for each therapeutic and each administrative
protocol, and administration to specific patients will also be adjusted to within
effective and safe ranges depending on the patient condition and responsiveness to
initial administrations.

Administration of a therapeutic agent for a condition in which increased
expression of mammalian Scm is implicated, for example, in the case of
promyelocytic leukemia, chronic myelogenous leukemia, lymphoblastic leukemia,
Burkitt's lymphoma, colorectal adenocarcinoma, lung carcinoma, melanoma, and
lymphoma, can be preceded by diagnosis of the condition using a mammalian Scm
probe, generated from any portion of the mammalian Scm gene, and probing the
suspect tissue. bDNA technology using bDNA probes to mammalian Scm gene
sequences or mammalian Scm mRNA sequences may be used, as described in WO
92/02526 or U.S. 5,451,503, and U.S. 4,775,619.

Once diagnosis is complete, treatment can include administration of
mammalian Scm polynucleotides, or antisense oligonucleotide by a gene therapy
protocol, or by administration, by other means including local or systemic
administration, of a mammalian Scm modulator, for example a mammalian Scm-
specific ribozyme, or a genetically altered mammalian Scm variant, for example a
dominant negative mammalian Scm, or a small molecule or peptide or peptoid
mammalian Scm modulator, or any combination of these potential therapeutics. The
patient can be subsequently monitored by periodic reprobing of the affected tissue
with a mammalian Scm probe.

Even in cancers where mammalian Scm mutations are not implicated,
mammalian Scm upregulation or enhancement of mammalian Scm function may have
therapeutic application. In these cancers, increasing mammalian Scm expression or
enhancing mammalian Scm function may help to suppress the tumors. Similarly, even
in tumors where mammalian Scm expression is not aberrant, effecting mammalian
Scm upregulation or augmentation of mammalian Scm activity may suppress
metastases.

Further objects, features, and advantages of the present invention will become
apparent from the detailed description. It should be understood, however, that the
detailed description, while indicating preferred embodiments of the invention, is given
by way of illustration only, since various changes and modifications within the spirit
and scope of the invention will become apparent to those skilled in the art from this
detailed description.

Definitions

A "nucleic acid molecule" or a "polynucleotide," as used herein, refers to either
RNA or DNA molecule that encodes a specific amino acid sequence or its
complementary strand. Nucleic acid molecules may also be non-coding sequences,
for example, a ribozyme, an antisense oligonucleotide, or an untranslated portion of a
gene. A "coding sequence" as used herein, refers to either RNA or DNA that encodes
a specific amino acid sequence or its complementary strand. A polynucleotide may
include, for example, an antisense oligonucleotide, or a ribozyme, and may also
include such items as a 3' or 5' untranslated region of a gene, or an intron of a gene,
or other region of a gene that does not make up the coding region of the gene. The
DNA or RNA may be single stranded or double stranded. Synthetic nucleic acids or
synthetic polynucleotides can be chemically synthesized nucleic acid sequences, and
may also be modified with chemical moieties to render the molecule resistant to
degradation. Synthetic nucleic acids can be ribozymes or antisense molecules, for
example. Modifications to synthetic nucleic acid molecules include nucleic acid
monomers or derivative or modifications thereof, including chemical moieties. For
example, phosphothioates can be used for the modification. A polynucleotide
derivative can include, for example, such polynucleotides as branched DNA (bDNA).
A polynucleotide can be a synthetic or recombinant polynucleotide, and can be
generated, for example, by polymerase chain reaction (PCR) amplification, or
recombinant expression of complementary DNA or RNA, or by chemical synthesis.
Mammalian Scm polynucleotides contain at least 95% and preferably at least 97%
identity to either mouse or human Scm sequences. These can be obtained, inter alia,
by hybridization of mouse or human Scm probes under conditions of stringent
hybridization. Encompassed within the definition of mammalian, human, and mouse
Scm are sequences which contain allelic variants, as well as sequences which differ
due to the degeneracy of the genetic code.
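
The identity thresholds recited above can be illustrated with a simple calculation. The sketch below computes percent identity over two sequences that are assumed to be already aligned to equal length, with "-" marking gaps. This is one common convention, adopted here as an assumption; the specification does not state how identity is to be calculated. The function names and the short example sequences are arbitrary and are not Scm sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; a gap against a base is a mismatch.

    Columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ATGGCCAAGCTGACCGATTT"   # arbitrary 20-base example
    candidate = "ATGGCCAAGCTGACCGATTA"   # differs at the final base only
    print(f"{percent_identity(reference, candidate):.1f}%")
    print(meets_threshold(reference, candidate, 95.0))
    print(meets_threshold(reference, candidate, 97.0))
```

Real comparisons would first align the sequences with an alignment tool; the toy example skips that step.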

The term "functional portion of" as used herein refers to a portion of a
mammalian Scm wild-type molecule which retains at least 50% of activity of
mammalian Scm. It also encompasses a portion of a mammalian Scm gene having
single base substitutions, deletions, or insertions that have no adverse effect on the
activity of the molecule. Truncations of mammalian Scm, fragments of Scm, and
combinations of fragments are contemplated. Such portions of hScm may also be
fused to other proteins, such as in a gene fusion.

The term "functional" as used herein refers to a gene functional in cancer or
[...] is functional if its expression causes, directly or indirectly, [...] associated
with differentiation, mitosis, oncogenesis, metastasis, or the like.



The term "modulate" as used herein refers to the ability of a molecule to alter
the function or expression of another molecule. Thus, modulate could mean, for
example, inhibit, antagonize, agonize, upregulate, downregulate, induce, or suppress.
A modulator has the capability of altering function of its target. Such alteration can
be accomplished at any stage of the transcription, translation, expression or function
of the protein, so that, for example, modulation of mammalian Scm can be
accomplished by modulation of the DNA, RNA, and protein products of the gene. It
is assumed that modulation of the function of the target, for example, mammalian Scm,
will in turn modulate, alter, or affect the function or pathways leading to a function of
genes and proteins that would otherwise associate, and interact, or respond to,
mammalian Scm.

A "malignancy" includes any proliferative disorder in which the cells
proliferating are ultimately harmful to the host. Cancer is an example of a
proliferative disorder that manifests a malignancy. Neoplasia is the state of cells
which experience uncontrolled cell growth, whether or not malignant.

The term "regulatory sequence" as used herein refers to a nucleic acid
sequence encoding one or more elements that are capable of affecting or effecting
expression of a gene sequence, including transcription or translation thereof, when the
gene sequence is placed in such a position as to subject it to the control thereof. Such
a regulatory sequence can be, for example, a minimal promoter sequence, a complete
promoter sequence, an enhancer sequence, an upstream activation sequence ("UAS"),
an operator sequence, a downstream termination sequence, a polyadenylation
sequence, an optimal 5' leader sequence to optimize initiation of translation, and a
Shine-Dalgarno sequence. Alternatively, the regulatory sequence can contain a
combination enhancer/promoter element. The regulatory sequence that is appropriate
for expression of the present construct differs depending upon the host system in
which the construct is to be expressed. Selection of the appropriate regulatory
sequences for use herein is within the capability of one skilled in the art. For
example, in prokaryotes, such a regulatory sequence can include one or more of a
promoter sequence, a ribosomal binding site, and a transcription termination
sequence. In eukaryotes, for example, such a sequence can include one or more of a
promoter sequence and/or a transcription termination sequence. If any necessary
component of a regulatory sequence that is needed for expression is lacking in the
polynucleotide construct, such component can be supplied by a vector into which
the polynucleotide construct can be inserted for expression. Regulatory sequences
suitable for use herein may be derived from any source including a prokaryotic
source, a eukaryotic source, a virus, a viral vector, a bacteriophage or from a linear
or circular plasmid. An example of a regulatory sequence is the human
immunodeficiency virus ("HIV") promoter that is located in the U3 and R region of
the HIV long terminal repeat ("LTR"). Alternatively, the regulatory sequence herein
can be a synthetic sequence, for example, one made by combining the UAS of one
gene with the remainder of a requisite promoter from another gene, such as the
GADP/ADH2 hybrid promoter.

The terms "protein", "polypeptide", "polypeptide derivatives" and modifications
and variants thereof refer herein to the expression product of a polynucleotide
construct of the invention as defined above. The terms further include truncations,
variants, alleles, analogs and derivatives thereof. Unless specifically mentioned
otherwise, such mammalian Scm polypeptides possess one or more of the bioactivities
of the mammalian Scm protein, such as those discovered herein. This term is not
limited to a specific length of the product of the mammalian Scm gene. Thus,
polypeptides that are identical or contain at least 85%, and more preferably 90%, and
most preferably 95% identity with the mammalian Scm protein or the mature
mammalian Scm protein, wherever derived, from human or nonhuman sources are
included within this definition of the mammalian Scm polypeptide. Also included,
therefore, are alleles and variants of the product of the mammalian Scm gene that
contain amino acid substitutions, deletions, or insertions. The amino acid
substitutions can be conservative amino acid substitutions or substitutions to eliminate
non-essential amino acid residues such as to alter a glycosylation site, a
phosphorylation site, an acetylation site, or to alter the folding pattern by altering the
position of the cysteine residue that is not necessary for function, etc. Conservative
amino acid substitutions are those that preserve the general charge,
hydrophobicity/hydrophilicity and/or steric bulk of the amino acid substituted, for
example, substitutions between the members of the following groups are conservative
substitutions: Gly/Ala, Val/Ile/Leu, Asp/Glu, Lys/Arg, Asn/Gln, Ser/Thr/Cys and
Phe/Trp/Tyr. Analogs include peptides having one or more peptide mimics, also
known as peptoids, that possess mammalian Scm protein-like activity. Included
within the definition are, for example, polypeptides containing one or more analogs of
an amino acid (including, for example, unnatural amino acids, etc.), polypeptides
with substituted linkages, as well as other modifications known in the art, both
naturally occurring and nonnaturally occurring. The term "mammalian Scm" also may
include post-expression modifications of the polypeptide, for example, glycosylations,
acetylations, phosphorylations, myristoylations, farnesylations, palmitoylations and the
like.
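
The grouping of conservative substitutions given above lends itself to a simple lookup. The following is a minimal sketch, assuming three-letter residue codes and using only the six groups from Gly/Ala through Ser/Thr/Cys; the table can be extended with further groups. The names CONSERVATIVE_GROUPS and is_conservative are introduced here for illustration only.

```python
# Groups of residues whose members are treated as conservative replacements
# for one another (three-letter amino acid codes).
CONSERVATIVE_GROUPS = [
    {"Gly", "Ala"},
    {"Val", "Ile", "Leu"},
    {"Asp", "Glu"},
    {"Lys", "Arg"},
    {"Asn", "Gln"},
    {"Ser", "Thr", "Cys"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    print(is_conservative("Asp", "Glu"))  # True: same acidic group
    print(is_conservative("Gly", "Lys"))  # False: different groups
```

A check of this kind classifies a substitution only by group membership; whether a given variant retains activity remains an experimental question.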

The term "polypeptide fragment" as used herein refers to a polypeptide
sequence that does not encode the full length of a protein but that is identical to a
region of the protein. The fragment is designed to retain the functional aspect of the
region of the polypeptide from which it is derived. Two fragments can cooperate to
[...] polypeptide fragments of the same gene may represent
expressed splice variants of that gene, although functionality and expression of the
polypeptide splice variant products may [...] similar [...]
may be related, at least in part, in function.

The term "derivative" as used herein in reference to a polypeptide or a
polynucleotide means [...] may be variously [...] by nucleotide [...] deletions
[...]. In any case, a derivative [...] of the polypeptide
from which it is derived.

[...] produced in vivo or in vitro [...]. For example, an isolated polypeptide can be
[...] isolated regulatory sequence [...] "isolated" for purposes [...] for
example such isolated polypeptides or polynucleotides [...]
or higher, such as 50%, 75%, 85% or 90%.

[...] include, but is not limited to, the following organisms: [...]
mammals, [...] humans, and vertebrates. A biological condition can include, for
example, [...] may or may not be characterized by [...]
proliferation is most likely a cancer condition, [...] may also be a condition arising in
the development of an organism.

The term "modulator" as used herein describes any moiety capable of changing
the endogenous activity of a polypeptide. Modulatory activities can include, for
example, modulation at the level of transcription, translation, expression, secretion, or
modulation of polypeptide activity inside or outside a cell. Modulation can include,
for example, inhibition, antagonism, and agonism, and modulation can include, for
example, modulation of upstream or downstream effects that affect the ultimate
activities in a pathway, or modulation of the configuration of a polypeptide such that
its activity is altered. Modulation can be transitory or permanent, and may be a dose
dependent effect.

The term "inhibitor" for use herein can be any inhibitor of a polypeptide
activity. The category includes but is not limited to any of the herein described
antagonists of mammalian Scm. The inhibitor of mammalian Scm can be an antibody-
based mammalian Scm antagonist, or a polypeptide fragment thereof, a peptide
mammalian Scm antagonist, a peptoid mammalian Scm antagonist, or a small
molecule mammalian Scm antagonist. The polypeptide inhibitor can be one screened
from a cDNA, cRNA, or phage display library of polypeptides. The inhibitor can be
a polynucleotide, such as, for example, a ribozyme or an antisense oligonucleotide, or
can be derivatives of these. It is expected that some inhibitors will act at
transcription, some at translation, and some on the mature protein. However, the use
and appropriateness of such inhibitors of mammalian Scm for the purposes of the
invention are not limited to any theories of mechanism of action of the inhibitor. It is
sufficient for purposes of the invention that an inhibitor inhibit the activity of
mammalian Scm.

The term "antagonist" as used herein refers to a molecule that inhibits or blocks
the activity of a polypeptide, either by blocking the polypeptide itself, or by causing a
reduced expression of the polypeptide by either blocking transcription of the gene
encoding the polypeptide, or by interfering with or destroying a transcription or
translation product of the gene. An antagonist may be, for example, a small
molecule, peptide, peptoid, polypeptide, or polynucleotide. The polynucleotide may
be, for example, a ribozyme, an antisense oligonucleotide, or a coding sequence.

The term "agonist" as used herein refers to a molecule that mimics the activity
of the target polypeptide. For example, in the case of mammalian Scm, an agonist
could mimic the transcriptional negative regulation capability of mammalian Scm. An
agonist may be, for example, a small molecule, peptide, peptoid, polypeptide, or
polynucleotide.
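
By way of illustration only, an antisense oligonucleotide of the kind named among the antagonists above is complementary to a stretch of the target transcript. The following Python sketch computes the reverse complement of a 20-nucleotide region read from SEQ ID NO:1, beginning at the ATG that precedes the Met-Lys-Leu-Glu coding region. The function name, the choice of region, and the 20-nucleotide length are assumptions made for this example; it is not a validated oligonucleotide design.

```python
# Minimal sketch: derive an antisense DNA oligonucleotide as the reverse
# complement of a sense-strand target region. Names and the chosen region
# are illustrative assumptions, not part of the specification.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


# 20-nt sense-strand region read from SEQ ID NO:1 (starts at an ATG codon).
target = "ATGAAATTGGAAGCACAGGA"
antisense = reverse_complement(target)
print(antisense)
```

A candidate produced this way would still need to be tested for specificity and activity by routine experimentation.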

The term "pharmaceutical composition" refers to a composition for
administration of a therapeutic agent, such as antibodies or a polypeptide or
inhibitors or genes and other therapeutic agents listed herein, in vivo, and refers to any
pharmaceutical carrier that does not itself induce the production of antibodies harmful
to the individual receiving the composition, and which may be administered without
undue toxicity.

The term "effective amount" as used herein refers to an amount that is
effective to achieve the desired effect. Where the effect is a therapeutic effect, the
[...] regression [...] of cancer that includes a reduction or growth slowing of cancer
cells. Where the [...] of mammalian Scm, the effective amount [...] selected for
determining effectiveness, and depends upon the effect [...]

An administration of a therapeutic agent of the invention includes
administration of a therapeutically effective amount of the agent of the invention. The
term "therapeutically effective amount" as used herein refers to an amount of a
therapeutic agent to treat or prevent a condition treatable by administration of a
composition of the invention. That amount is the amount sufficient to exhibit a
detectable therapeutic or preventative or ameliorative effect. The effect may include,
for example, treatment or prevention of the conditions listed herein. The precise
effective amount for a subject will depend upon the subject's size and health, the
nature and extent of the condition being treated, recommendations of the treating
physician, and the therapeutics or combination of therapeutics selected for
administration. Thus, it is not useful to specify an exact effective amount in advance.
However, the effective amount for a given situation can be determined by routine
experimentation. Administration can include administration of a polypeptide, and
causing the polypeptide to be expressed in an animal by administration of the
polynucleotide encoding the polypeptide.

A "recombinant vector" herein refers to any vector for transfer or expression of
the polynucleotides herein in a cell, including, for example, viral vectors, non-viral
vectors, plasmid vectors and vectors derived from the regulatory sequences of
heterologous hosts and expression systems.

The term "in vivo administration" refers to administration to a mammal of a
polynucleotide encoding a polypeptide for expression in the mammal. In particular,
direct in vivo administration involves transfecting a mammal's cell with a coding
sequence without removing the cell from the mammal. Thus, direct in vivo
administration may include direct injection of the DNA encoding the polypeptide of
interest in the region afflicted by the malignancy or proliferative disorder, resulting in
expression in the mammal's cells.

The term "ex vivo administration" refers to transfecting a cell, for example, a
cell from a population of cells that are malignant or proliferating, after the cell is
removed from the mammal. After transfection the cell is then replaced in the
mammal. Ex vivo administration can be accomplished by removing cells from a
mammal, optionally selecting for cells to transform (i.e., cells that are malignant or
proliferating), rendering the selected cells incapable of replication, transforming the
selected cells with a polynucleotide encoding a gene for expression (i.e., mammalian
Scm), including also a regulatory region for facilitating the expression, and placing the
transformed cells back into the mammal for expression of the mammalian Scm.

"Biologically active" refers to a molecule that retains a specific activity. A 
biologically active mammalian Scm polypeptide, for example, retains the activity 
including for example the control of a homeotic gene or group of homeotic genes. 

"Mammalian cell" as used herein refers to a subset of eukaryotic cells useful in 
the invention as host cells, and includes human cells, and animal cells such as those
from dogs, cats, cattle, horses, rabbits, mice, goats, pigs, etc. The cells used can be
genetically unaltered or can be genetically altered, for example, by transformation
with appropriate expression vectors, marker genes, and the like. Mammalian cells
suitable for the method of the invention are any mammalian cell capable of expressing
the genes of interest, or any mammalian cells that can express a cDNA library, cRNA
library, genomic DNA library or any protein or polypeptide useful in the method of
the invention. Mammalian cells also include cells from cell lines such as those
immortalized cell lines available from the American Type Culture Collection (ATCC).
Such cell lines include, for example, rat pheochromocytoma cells (PC12 cells),
embryonal carcinoma cells (P19 cells), Chinese hamster ovary (CHO) cells, HeLa
cells, baby hamster kidney (BHK) cells, monkey kidney cells (COS), human
hepatocellular carcinoma cells (e.g., Hep G2), human embryonic kidney cells, mouse
Sertoli cells, canine kidney cells, buffalo rat liver cells, human lung cells, human liver
cells, mouse mammary tumor cells, as well as others. Also included are hematopoietic
stem cells, neuronal stem cells such as neuronal sphere cells, and embryonic stem
cells (ES cells).



The present invention will now be illustrated by reference to the following
examples which set forth particularly advantageous embodiments. However, it should
be noted that these embodiments are illustrative and are not to be construed as
restricting the invention in any way.

Example 1

A small molecule modulator of mammalian Scm is identified and incorporated
into a pharmaceutical composition including a liposomal-based pharmaceutically
acceptable carrier for administration to a cancer patient for controlling the expression
or activity of mammalian Scm in the patient. Administration of the composition is
achieved by injection into the tumor tissue. The patient is monitored for reduction of
mammalian Scm activity as a diagnostic marker evaluating the effectiveness of the
treatment.

A population of progenitor cells is treated with a functional portion of
recombinant mammalian Scm polypeptide and induced to differentiate. The process is
reversed by administering to the population of cells an inhibitor of mammalian Scm
activity.

Example

Northern blots of mRNA isolated from various tissues were probed with
mammalian Scm cDNA for an analysis of the expression differential of mammalian
Scm in normal and cancerous tissues, using standard techniques for accomplishing the
hybridizations. The normal tissues probed were human adult heart, skeletal muscle,
pancreas, prostate, testes, ovary, colon, thymus, brain, placenta, lung, liver, kidney,
peripheral leukocytes, and spleen. The tissue specific expression of mammalian Scm
in normal human adult tissue indicated abundant mammalian Scm transcript in human
heart, skeletal muscle, pancreas, and testes. A somewhat less abundant amount of
transcript was present in human prostate, ovary, colon, thymus, brain, placenta, lung,
liver, and kidney, and the transcript was virtually undetectable in human leukocytes,
and undetectable in the human spleen tissue probed.

By contrast, mammalian Scm transcripts were present at an abundantly high
level in the following human cancer cell lines: promyelocytic leukemia HL-60, HeLa
cell S3, chronic myelogenous leukemia K-562, lymphoblastic leukemia MOLT-4,
Burkitt's lymphoma Raji, colorectal adenocarcinoma SW480, lung carcinoma A549,
and melanoma G361. In addition, Scm transcript was also abundantly high in lung
carcinoma tissue, colorectal adenocarcinoma tissue, and lymphocytic cancer tissues.
The mammalian Scm transcript was approximately 4 to 4.2 kilobases in size for all
hybridizations. Hybridizations were conducted using stringent conditions and a
standard hybridization protocol for accomplishing Northern blot hybridizations.

Transcript levels were controlled for by probing with an actin probe on the same
blots probed with mammalian Scm coding sequence.
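
The actin control described above is commonly used to express each lane's signal relative to its loading. The following Python sketch shows one way such a normalization could be computed from band intensities. The function name, the sample labels, and every numeric value are hypothetical placeholders chosen for this example; they are not measurements from the blots described here, and the specification does not state that quantitation was performed this way.

```python
# Minimal sketch: normalize a probe signal to an actin loading control.
# All names and numbers below are hypothetical, for illustration only.


def normalize_to_actin(scm_signal: dict, actin_signal: dict) -> dict:
    """Return the Scm/actin intensity ratio for each sample."""
    ratios = {}
    for sample, scm in scm_signal.items():
        actin = actin_signal[sample]
        if actin <= 0:
            raise ValueError(f"actin signal must be positive for {sample}")
        ratios[sample] = scm / actin
    return ratios


# Hypothetical band intensities in arbitrary densitometry units.
scm_signal = {"sample_A": 1200.0, "sample_B": 300.0}
actin_signal = {"sample_A": 1000.0, "sample_B": 1500.0}

ratios = normalize_to_actin(scm_signal, actin_signal)
for sample, ratio in ratios.items():
    print(sample, round(ratio, 2))
```

Dividing by the actin signal keeps a lane that was simply loaded with more RNA from being read as a lane with higher Scm expression.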

The description of the invention draws on previously published work and, at
times, on pending patent applications. By way of example, such work consists of
scientific papers, abstracts, or issued patents, and published patent applications. All
published work cited herein is hereby incorporated by reference.

The following sequences are described below:

SEQ ID NOS: 1, 3, and 5 are human cDNA sequences for Scm isoforms

SEQ ID NOS: 2, 4, and 6 are translated human amino acid sequences for the Scm
isoforms

SEQ ID NO: 7 is the mouse cDNA for Scm

SEQ ID NO: 8 is the translated mouse amino acid sequence for Scm
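
The relationship between the cDNA sequences and the translated amino acid sequences listed above can be sketched in a few lines of Python. The example below translates a 27-nucleotide fragment read from SEQ ID NO:1 using the standard genetic code and prints the result in the three-letter notation used in the sequence listing. The function and variable names are assumptions made for this illustration, and the fragment boundaries were chosen by hand.

```python
# Minimal sketch: translate a cDNA fragment with the standard genetic code.
# Names are illustrative assumptions; the fragment is read from SEQ ID NO:1.
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(cdna: str) -> str:
    """Translate in frame from the first base, stopping at a stop codon."""
    seq = cdna.upper().replace(" ", "")
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the polypeptide
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)


fragment = "ATGAAATTGGAAGCACAGGACCCCAGG"
peptide = translate(fragment)
print(peptide)  # Met Lys Leu Glu Ala Gln Asp Pro Arg
```

The printed residues agree with the first nine residues listed for SEQ ID NO:4.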
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SEQUENCE LISTING 

(1) GENERAL INFORMATION:

     (i) APPLICANT: Randazzo, Filippo

    (ii) TITLE OF INVENTION: Mammalian Sex Comb on Midleg Acts as a
                             Tumor Suppressor

   (iii) NUMBER OF SEQUENCES: 8

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Chiron Corporation
          (B) STREET: 4560 Horton Street
          (C) CITY: Emeryville
          (D) STATE: California
          (E) COUNTRY: U.S.A.
          (F) ZIP: 94608

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Guth, Joseph
          (B) REGISTRATION NUMBER: [...]1,261
          (C) REFERENCE/DOCKET NUMBER: 1224.006

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (510) 923-3868
          (B) TELEFAX: (510) 655-3542

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 2855 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CAAATCATAA TAATGCAGGT CATTTTACCT GGGACAAATA CCTAAAAGAA ACATGTTCAG 60
TCCCAGCGCC TGTCCATTGC TTCAAGCAGT CCTACACACC TCCAAGCAAC GAGTTCAAGA 120
TCAGTATGAA ATTGGAAGCA CAGGACCCCA GGAACACCAC ATCCACCTGT ATTGCCACAG 180
TAGTTGGACT GACAGGTGCC CGCCTTCGCC TGCGCCTTGA TGGGAGCGAC AACAAAAATG 240
ACTTCTGGCG GCTGGTTGAC TCAGCTGAAA TCCAGCCTAT TGGGAACTGT GAAAAGAATG 300
GGGGTATGCT ACAGCCACCT CTTGGATTTC GGCTGAATGC GTCTTCTTGG CCCATGTTCC 360
TTTTGAAGAC GCTAAATGGA GCAGAGATGG CTCCCATCAG GATTTTCCAC AAGGAGCCAC 420
CATCGCCTTC CCACAACTTC TTCAAAATGG GAATGAAGCT AGAAGCTGTG GACAGGAAGA 480
ACCCTCATTT CATTTGCCCA GCCACTATTG GGGAGGTTCG GGGCTCAGAG GTGCTTGTCA 540
CTTTTGATGG GTGGCGAGGG GCCTTTGACT ACTGGTGCCG CTTCGACTCC CGAGACATCT 600
TCCCTGTGGG CTGGTGTTCC TTGACTGGAG ACAACCTGCA GCCTCCTGGC ACCAAAGTTG 660
TGATTCCAAA GAATCCCTAT CCTGCCTCCG ATGTGAATAC TGAGAAGCCC AGCATCCACA 720
GCAGCACCAA AACTGTCTTG GAACATCAAC CAGGGCAGAG GGGGCGTAAA CCAGGAAAGA 780
AGCGGGGCCG GACACCCAAG ACCCTAATTT CCCATCCCAT CTCTGCCCCA TCCAAGACAG 840
CTGAACCTTT GAAATTCCCA AAGAAGAGAG GTCCCAAACC TGGCAGCAAG AGGAAACCTC 900
GGACTTTGCT GAACCCACCA CCTGCCTCAC CAACAACCAG CACTCCTGAA CCGGATACCA 960
GCACTGTACC CCAGGATGCT GCCACCATCC CCAGCTCAGC CATGCAGGCC CCAACAGTTT 1020
GTATCTACTT GAACAAGAAT GGCAGCACAG GCCCCCACTT AGATAAGAAG AAGGTCCAGC 1080
AACTCCCTGA CCATTTTGGA CCAGCCCGTG CCTCTGTGGT GTTGCAGCAG GCTGTCCAGG 1140
CCTGTATCGA CTGTGCTTAT CACCAGAAAA CCGTCTTCAG CTTCCTCAAG CAAGGCCATG 1200
GTGGTGAGGT TATCTCAGCC GTGTTTGACC GGGAACAGCA TACCCTCAAC CTCCCAGCAG 1260
TCAACAGCAT CACCTACGTC CTCCGCTTCC TGGAGAAACT CTGCCACAAC CTTCGTAGTG 1320
ACAATCTGTT TGGCAACCAG CCCTTTACAC AGACTCACTT GTCACTCACT GCCATAGAGT 1380
ACAGCCACAG CCACGACAGG TACCTACCAG GTGAAACCTT TGTCCTGGGG AATAGTCTGG 1440
CCCGCTCCTT GGAACCACAC TCAGACTCAA TGGACTCTGC CTCAAATCCC ACCAACCTTG 1500
TCAGCACCTC CCAAAGGCAC CGGCCCTTGC TTTCATCCTG TGGCCTCCCA CCAAGCACTG 1560
CCTCAGCTGT GCGCAGGCTA TGCTCCAGGG GGTCGGACCG ATACCTGGAG AGCCGCGATG 1620
CCTCTCGACT GAGTGGCCGG GACCCCTCCT CGTGGACAGT CGAGGATGTG ATGCAGTTTG 1680
TCCGGGAAGC TGATCCTCAG CTTGGACCCC ACGCTGACCT GTTTCGCAAA CACGAGATCG 1740
ATGGCAAGGC CCTGCTGCTG CTGCGCAGTG ACATGATGAT GAAGTACATG GGCCTGAAGC 1800
TGGGGCCTGC ACTCAAGCTC TCCTACCACA TTGACCGGCT GAAGCAGGGC AAGTTCTGAA 1860
CCAGGAGAGG CAGCCTAGAC AACCAAGTGG CAGCAGGTGG GGGCATTCTT CTAAGAATGA 1920
GGGGCATCAG CCCACCCCAG GCACCTCAGT GGGGTTCCGG GCCACCTCAG GACTCCAAGA 1980
GGCTGTGTGG AGCCACCACT CCTAGCCACA GCTGCCATGA TAAGTCCTTC CATGAAGGAC 2040
TGAGGAGGGA GAGTGGGGGT CCAGGGCTGG TGCTGCTCTT CCCTCAGCTC TGCCGGGGCT 2100
CTAAGGTCCC TCTATTTATT TCTCAACCCT GGCTGGCCTC TCACCAGGAG TTTAGGCTGA 2160
ATGCCTTCCA CGTGATGGAG GAAAAGGCCA ACTCTGTCCT GGTCTTGCTG TGGCACCCCA 2220
TCGCCCCACA GCTCGTACCT TCTCACCAGA TTCCCCTGAA TCCAAACTCG TGGTGCAAAC 2280
CTCTACCTTT TTTACAAAAA GATCTTATTG TTAATTTATT GTTTCTGGCA CTTGGGCAAA 2340
CCCTGTAGTT AATACTCCTC CCACACTAGA CACTGGGTTT CAGGAGGAGG GAGACTGCCC 2400
TGCTTTGGTC CCAGAGAGGC CCTCTGCAGA TAGGCGTGGC CCCTCTTCAG AGGACACTAC 2460
CCTAGGGCAC TTTCTCTTTG AGGTGGAGAG ACCCATAAAG CCTTGACCAC ATCACTCCAT 2520
ATGGGGAGGA GAAGGATCCC TGTCACCTTC TCCTCTCTTC ACGGGGCCCT TTTGCAGCCC 2580
TAGGCCTCAT CTGTGGGAAG GGAGTCCCTG GCTCATACTG CCCCCACCAC AGCTCCTTGC 2640
CCTGGCCAGA ACTGCTGTCG AAGAAAATCA GGCCGGAAGG CCAAGAAGGC GCTAAGGGGG 2700
ATGGGAGGGC AGGTTTTCCA GGCTGGAGTC GGTTCCACCC ACTCGCCTGT CCACAGGCTT 2760
CCTTGTAAGC AAGTCAGCAG CACAGCTACT CACGCTGCCA TCTGGACTTA TTTTATGTCA 2820
ATCTGTTTAT AAATAAAAAC CAATATAGGG AATTC 2855
(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 620 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:



4® Ile Pro His Asn Asn Ala Gly His Phe Thr Trp Asp Lys Tyr Leu 

1 -5 10 is 



Lys Glu Thr Cys Ser Val Pro Ala Pro Val His Cys Phe Lys Gln Ser
            20                  25                  30

Tyr Thr Pro Pro Ser Asn Glu Phe Lys Ile Ser Met Lys Leu Glu Ala
        35                  40                  45

Gln Asp Pro Arg Asn Thr Thr Ser Thr Cys Ile Ala Thr Val Val Gly
    50                  55                  60

Leu Thr Gly Ala Arg Leu Arg Leu Arg Leu Asp Gly Ser Asp Asn Lys
65                  70                  75                  80

Asn Asp Phe Trp Arg Leu Val Asp Ser Ala Glu Ile Gln Pro Ile Gly
                85                  90                  95

Asn Cys Glu Lys Asn Gly Gly Met Leu Gln Pro Pro Leu Gly Phe Arg
            100                 105                 110

Leu Asn Ala Ser Ser Trp Pro Met Phe Leu Leu Lys Thr Leu Asn Gly
        115                 120                 125



Ala Glu Met Ala Pro Ile Arg Ile Phe His Lys Glu Pro Pro Ser Pro
    130                 135                 140

Ser His Asn Phe Phe Lys Met Gly Met Lys Leu Glu Ala Val Asp Arg
145                 150                 155                 160

Lys Asn Pro His Phe Ile Cys Pro Ala Thr Ile Gly Glu Val Arg Gly
                165                 170                 175

Ser Glu Val Leu Val Thr Phe Asp Gly Trp Arg Gly Ala Phe Asp Tyr
            180                 185                 190

Trp Cys Arg Phe Asp Ser Arg Asp Ile Phe Pro Val Gly Trp Cys Ser
        195                 200                 205

Leu Thr Gly Asp Asn Leu Gln Pro Pro Gly Thr Lys Val Val Ile Pro
    210                 215                 220

Lys Asn Pro Tyr Pro Ala Ser Asp Val Asn Thr Glu Lys Pro Ser Ile
225                 230                 235                 240

His Ser Ser Thr Lys Thr Val Leu Glu His Gln Pro Gly Gln Arg Gly
                245                 250                 255

Arg Lys Pro Gly Lys Lys Arg Gly Arg Thr Pro Lys Thr Leu Ile Ser
            260                 265                 270

His Pro Ile Ser Ala Pro Ser Lys Thr Ala Glu Pro Leu Lys Phe Pro
        275                 280                 285

Lys Lys Arg Gly Pro Lys Pro Gly Ser Lys Arg Lys Pro Arg Thr Leu
    290                 295                 300

Leu Asn Pro Pro Pro Ala Ser Pro Thr Thr Ser Thr Pro Glu Pro Asp
305                 310                 315                 320

Thr Ser Thr Val Pro Gln Asp Ala Ala Thr Ile Pro Ser Ser Ala Met
                325                 330                 335

Gln Ala Pro Thr Val Cys Ile Tyr Leu Asn Lys Asn Gly Ser Thr Gly
            340                 345                 350

Pro His Leu Asp Lys Lys Lys Val Gln Gln Leu Pro Asp His Phe Gly
        355                 360                 365

Pro Ala Arg Ala Ser Val Val Leu Gln Gln Ala Val Gln Ala Cys Ile
    370                 375                 380

Asp Cys Ala Tyr His Gln Lys Thr Val Phe Ser Phe Leu Lys Gln Gly
385                 390                 395                 400

His Gly Gly Glu Val Ile Ser Ala Val Phe Asp Arg Glu Gln His Thr
                405                 410                 415

Leu Asn Leu Pro Ala Val Asn Ser Ile Thr Tyr Val Leu Arg Phe Leu
            420                 425                 430

Glu Lys Leu Cys His Asn Leu Arg Ser Asp Asn Leu Phe Gly Asn Gln
        435                 440                 445

Pro Phe Thr Gln Thr His Leu Ser Leu Thr Ala Ile Glu Tyr Ser His
    450                 455                 460

Ser His Asp Arg Tyr Leu Pro Gly Glu Thr Phe Val Leu Gly Asn Ser
465                 470                 475                 480

Leu Ala Arg Ser Leu Glu Pro His Ser Asp Ser Met Asp Ser Ala Ser
                485                 490                 495

Asn Pro Thr Asn Leu Val Ser Thr Ser Gln Arg His Arg Pro Leu Leu
            500                 505                 510

Ser Ser Cys Gly Leu Pro Pro Ser Thr Ala Ser Ala Val Arg Arg Leu
        515                 520                 525

Cys Ser Arg Gly Ser Asp Arg Tyr Leu Glu Ser Arg Asp Ala Ser Arg
    530                 535                 540

Leu Ser Gly Arg Asp Pro Ser Ser Trp Thr Val Glu Asp Val Met Gln
545                 550                 555                 560

Phe Val Arg Glu Ala Asp Pro Gln Leu Gly Pro His Ala Asp Leu Phe
                565                 570                 575

Arg Lys His Glu Ile Asp Gly Lys Ala Leu Leu Leu Leu Arg Ser Asp
            580                 585                 590

Met Met Met Lys Tyr Met Gly Leu Lys Leu Gly Pro Ala Leu Lys Leu
        595                 600                 605

Ser Tyr His Ile Asp Arg Leu Lys Gln Gly Lys Phe
    610                 615                 620

(2) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 3327 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GCGGAAACAT GGCGGCGGGA AGGGAGTGAG CCGCCCCGCG CCCCCGCCGC GCCCTCAGAT 60
GGAGAAATTA GCATACAAAG AAACTGACTT GTCAGAAGTC AGAGCAAGGT ATTGGTGGAT 120
CCAGGGATAA ATCCCAAACT TCTTAACCCC TAGACCGGTT TTTAGTCCAT TGACTATGCA 180
GCCTAATGTG ATAGACTGGA GTGATGTTAG AAAACACAAA TATGGTCACC TATCAGAGTC 240
TGCATCCCAA TATCAAGAAG CTGCTGACAT CCTGGATCTA GGGTTGTAAA GAAGATTACA 300
TGAGCTAATG GATGTGAAAA CATCTTAAAA ACTCTCAAAT ACTTTTCAAC TTTGGAGGAT 360
TATTATGATT TTCATTCTGT TCAGCGGCTA TACTCAGACT TTACTCTAAA AGTCAAATCT 420
TCTGACATTC TTTGAAGTGA AGCATTCTAT GAATGTGAGC TGAAGAAATG AATGAAATGA 480
AATAATGCAG GTCATTTTAC CTGGGACAAA TACCTAAAAG AAACATGTTC AGTCCCAGCG 540
CCTGTCCATT GCTTCAACCA GTCCTACACA CCTCCAAGCA ACGAGTTCAA GATCAGTATG 600
AAATTGGAAG CACAGGACCC CAGGAACACC ACATCCACCT GTATTGCCAC AGTAGTTGGA 660
CTGACAGGTG CCCGCCTTCG CCTGCGCCTT GATGGGAGCG ACAACAAAAA TGACTTCTGG 720
CGGCTGGTTG ACTCAGCTGA AATCCAGCCT ATTGGGAACT GTGAAAAGAA TGGGGGTATG 780
CTACAGCCAC CTCTTGGATT TCGGCTGAAT GCGTCTTCTT GGCCCATGTT CCTTTTGAAG 840
ACGCTAAATG GAGCAGAGAT GGCTCCCATC AGGATTTTCC ACAAGGAGCC ACCATCGCCT 900
TCCCACAACT TCTTCAAAAT GGGAATGAAG CTAGAAGCTG TGGACAGGAA GAACCCTCAT 960
TTCATTTGCC CAGCCACTAT TGGGGAGGTT CGGGGCTCAG AGGTGCTTGT CACTTTTGAT 1020
GGGTGGCGAG GGGCCTTTGA CTACTGGTGC CGCTTCGACT CCCGAGACAT CTTCCCTGTG 1080
GGCTGGTGTT CCTTGACTGG AGACAACCTG CAGCCTCCTG GCACCAAAGT TGTGATTCCA 1140
AAGAATCCCT ATCCTGCCTC CGATGTGAAT ACTGAGAAGC CCAGCATCCA CAGCAGCACC 1200
AAAACTGTCT TGGAACATCA ACCAGGGCAG AGGGGGCGTA AACCAGGAAA GAAGCGGGGC 1260
CGGACACCCA AGACCCTAAT TTCCCATCCC ATCTCTGCCC CATCCAAGAC AGCTGAACCT 1320
TTGAAATTCC CAAAGAAGAG AGGTCCCAAA CCTGGCAGCA AGAGGAAACC TCGGACTTTG 1380
CTGAACCCAC CACCTGCCTC ACCAACAACC AGCACTCCTG AACCGGATAC CAGCACTGTA 1440
CCCCAGGATG CTGCCACCAT CCCCAGCTCA GCCATGCAGG CCCCAACAGT TTGTATCTAC 1500
TTGAACAAGA ATGGCAGCAC AGGCCCCCAC TTAGATAAGA AGAAGGTCCA GCAACTCCCT 1560
GACCATTTTG GACCAGCCCG TGCCTCTGTG GTGTTGCAGC AGGCTGTCCA GGCCTGTATC 1620
GACTGTGCTT ATCACCAGAA AACCGTCTTC AGCTTCCTCA AGCAAGGCCA TGGTGGTGAG 1680
GTTATCTCAG CCGTGTTTGA CCGGGAACAG CATACCCTCA ACCTCCCAGC AGTCAACAGC 1740
ATCACCTACG TCCTCCGCTT CCTGGAGAAA CTCTGCCACA ACCTTCGTAG TGACAATCTG 1800
TTTGGCAACC AGCCCTTTAC ACAGACTCAC TTGTCACTCA CTGCCATAGA GTACAGCCAC 1860
AGCCACGACA GGTACCTACC AGGTGAAACC TTTGTCCTGG GGAATAGTCT GGCCCGCTCC 1920
TTGGAACCAC ACTCAGACTC AATGGACTCT GCCTCAAATC CCACCAACCT TGTCAGCACC 1980
TCCCAAAGGC ACCGGCCCTT GCTTTCATCC TGTGGCCTCC CACCAAGCAC TGCCTCAGCT 2040
GTGCGCAGGC TATGCTCCAG GGGGTCGGAC CGATACCTGG AGAGCCGCGA TGCCTCTCGA 2100
CTGAGTGGCC GGGACCCCTC CTCGTGGACA GTCGAGGATG TGATGCAGTT TGTCCGGGAA 2160
GCTGATCCTC AGCTTGGACC CCACGCTGAC CTGTTTCGCA AACACGAGAT CGATGGCAAG 2220
GCCCTGCTGC TGCTGCGCAG TGACATGATG ATGAAGTACA TGGGCCTGAA GCTGGGGCCT 2280
GCACTCAAGC TCTCCTACCA CATTGACCGG CTGAAGCAGG GCAAGTTCTG AACCAGGAGA 2340
GGCAGCCTAG ACAACCAAGT GGCAGCAGGT GGGGGCATTC TTCTAAGAAT GAGGGGCATC 2400
AGCCCACCCC AGGCACCTCA GTGGGGTTCC GGGCCACCTC AGGACTCCAA GAGGCTGTGT 2460

39 



WO 97/4221 1 



PCT/US97/07575 



10 



GGAGCCACCA CTCCTAGCCA CAGCTGCCAT GATAAGTCCT TCCATGAAGG ACTGAGGAGG 2520
GAGAGTGGGG GTCCAGGGCT GGTGCTGCTC TTCCCTCAGC TCTGCCGGGG CTCTAAGGTC 2580
CCTCTATTTA TTTCTCAACC CTGGCTGGCC TCTCACCAGG AGTTTAGGCT GAATGCCTTC 2640
CACGTGATGG AGGAAAAGGC CAACTCTGTC CTGGTCTTGC TGTGGCACCC CATCGCCCCA 2700
CAGCTCGTAC CTTCTCACCA GATTCCCCTG AATCCAAACT CGTGGTGCAA ACCTCTACCT 2760
TTTTTACAAA AAGATCTTAT TGTTAATTTA TTGTTTCTGG CACTTGGGCA AACCCTGTAG 2820
TTAATACTCC TCCCACACTA GACACTGGGT TTCAGGAGGA GGGAGACTGC CCTGCTTTGG 2880
TCCCAGAGAG GCCCTCTGCA GATAGGCGTG GCCCCTCTTC AGAGGACACT ACCCTAGGGC 2940
ACTTTCTCTT TGAGGTGGAG AGACCCATAA AGCCTTGACC ACATCACTCC ATATGGGGAG 3000
GAGAAGGATC CCTGTCACCT TCTCCTCTCT TCACGGGGCC CTTTTCCAGC CCTAGGCCTC 3060
ATCTGTGGGA AGGGAGTCCC TGGCTCATAC TGCCCCCACC ACAGCTCCTT GCCCTGGCCA 3120
GAACTGCTGT CGAAGAAAAT CAGGCCGGAA GGCCAAGAAG GCGCTAAGGG GGATGGGAGG 3180
GCAGGTTTTC CAGGCTGGAG TCGGTTCCAC CCACTCGCCT GTCCACAGGC TTCCTTGTAA 3240
GCAAGTCAGC AGCACAGCTA CTCACGCTGC CATCTGGACT TATTTTATGT CAATCTGTTT 3300
ATAAATAAAA ACCAATATAG GGAATTC 3327
(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 577 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Lys Leu Glu Ala Gln Asp Pro Arg Asn Thr Thr Ser Thr Cys Ile
1               5                   10                  15

Ala Thr Val Val Gly Leu Thr Gly Ala Arg Leu Arg Leu Arg Leu Asp
            20                  25                  30

Gly Ser Asp Asn Lys Asn Asp Phe Trp Arg Leu Val Asp Ser Ala Glu
        35                  40                  45

Ile Gln Pro Ile Gly Asn Cys Glu Lys Asn Gly Gly Met Leu Gln Pro
    50                  55                  60

Pro Leu Gly Phe Arg Leu Asn Ala Ser Ser Trp Pro Met Phe Leu Leu
65                  70                  75                  80

Lys Thr Leu Asn Gly Ala Glu Met Ala Pro Ile Arg Ile Phe His Lys
                85                  90                  95



Glu Pro Pro Ser Pro Ser His Asn Phe Phe Lys Met Gly Met Lys Leu
            100                 105                 110

Glu Ala Val Asp Arg Lys Asn Pro His Phe Ile Cys Pro Ala Thr Ile
        115                 120                 125

Gly Glu Val Arg Gly Ser Glu Val Leu Val Thr Phe Asp Gly Trp Arg
    130                 135                 140

Gly Ala Phe Asp Tyr Trp Cys Arg Phe Asp Ser Arg Asp Ile Phe Pro
145                 150                 155                 160

Val Gly Trp Cys Ser Leu Thr Gly Asp Asn Leu Gln Pro Pro Gly Thr
                165                 170                 175

Lys Val Val Ile Pro Lys Asn Pro Tyr Pro Ala Ser Asp Val Asn Thr
            180                 185                 190

Glu Lys Pro Ser Ile His Ser Ser Thr Lys Thr Val Leu Glu His Gln
        195                 200                 205

Pro Gly Gln Arg Gly Arg Lys Pro Gly Lys Lys Arg Gly Arg Thr Pro
    210                 215                 220

Lys Thr Leu Ile Ser His Pro Ile Ser Ala Pro Ser Lys Thr Ala Glu
225                 230                 235                 240

Pro Leu Lys Phe Pro Lys Lys Arg Gly Pro Lys Pro Gly Ser Lys Arg
                245                 250                 255

Lys Pro Arg Thr Leu Leu Asn Pro Pro Pro Ala Ser Pro Thr Thr Ser
            260                 265                 270

Thr Pro Glu Pro Asp Thr Ser Thr Val Pro Gln Asp Ala Ala Thr Ile
        275                 280                 285

Pro Ser Ser Ala Met Gln Ala Pro Thr Val Cys Ile Tyr Leu Asn Lys
    290                 295                 300

Asn Gly Ser Thr Gly Pro His Leu Asp Lys Lys Lys Val Gln Gln Leu
305                 310                 315                 320

Pro Asp His Phe Gly Pro Ala Arg Ala Ser Val Val Leu Gln Gln Ala
                325                 330                 335

Val Gln Ala Cys Ile Asp Cys Ala Tyr His Gln Lys Thr Val Phe Ser
            340                 345                 350

Phe Leu Lys Gln Gly His Gly Gly Glu Val Ile Ser Ala Val Phe Asp
        355                 360                 365

Arg Glu Gln His Thr Leu Asn Leu Pro Ala Val Asn Ser Ile Thr Tyr
    370                 375                 380

Val Leu Arg Phe Leu Glu Lys Leu Cys His Asn Leu Arg Ser Asp Asn
385                 390                 395                 400

Leu Phe Gly Asn Gln Pro Phe Thr Gln Thr His Leu Ser Leu Thr Ala
                405                 410                 415

Ile Glu Tyr Ser His Ser His Asp Arg Tyr Leu Pro Gly Glu Thr Phe
            420                 425                 430

Val Leu Gly Asn Ser Leu Ala Arg Ser Leu Glu Pro His Ser Asp Ser
        435                 440                 445

Met Asp Ser Ala Ser Asn Pro Thr Asn Leu Val Ser Thr Ser Gln Arg
    450                 455                 460

His Arg Pro Leu Leu Ser Ser Cys Gly Leu Pro Pro Ser Thr Ala Ser
465                 470                 475                 480

Ala Val Arg Arg Leu Cys Ser Arg Gly Ser Asp Arg Tyr Leu Glu Ser
                485                 490                 495

Arg Asp Ala Ser Arg Leu Ser Gly Arg Asp Pro Ser Ser Trp Thr Val
            500                 505                 510

Glu Asp Val Met Gln Phe Val Arg Glu Ala Asp Pro Gln Leu Gly Pro
        515                 520                 525

His Ala Asp Leu Phe Arg Lys His Glu Ile Asp Gly Lys Ala Leu Leu
    530                 535                 540

Leu Leu Arg Ser Asp Met Met Met Lys Tyr Met Gly Leu Lys Leu Gly
545                 550                 555                 560

Pro Ala Leu Lys Leu Ser Tyr His Ile Asp Arg Leu Lys Gln Gly Lys
                565                 570                 575

Phe



(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 3255 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:





CGGAAACATG GCGGCGGGAA GGGAGTGAGC CGCCCGGCGC CCCCGCCGCG CCCTCAGATG 60
GAGAAATTAG CATACAAAGA AACTGACTTG TCAGAAGTCA GAGCAAGGTA TTGGTGGATC 120
CAGGGATAAA TCCCAAACTT CTTAACCCCT AGACCGGTTT TTAGTCCATT GACTATGCAG 180
CCTAATGTGA TAGACTGGAG TGATGTTAGA AAACACAAAT ATGGTCACCT ATCAGAGTCT 240
GCATCCCAAT ATCAAGAAGC TGCTGACATC CTGGATCTAG GGTTGTAAAG AAGATTACAT 300
GAGCTAATGG ATGTGAAAAC ATCTTAAAAA CTCTCAAATA CTTTTCAACT TTGGAGGATT 360
ATTATGATTT TCATTCTGTT CAGCGGCCAT ACTCAGACTT TACTCTAAAA GTCAAATCTT 420
CTGACATTCT TTGAAGTGAA GCATTCTATG AATGTGAGCT GAAGAAATGA ATGAAATGAA 480
ATAATGCAGT CCTACACACC TCCAAGCAAC GAGTTCAAGA TCAGTATGAA ATTGGAAGCA 540
CAGGACCCCA GGAACACCAC ATCCACCTGT ATTGCCACAG TAGTTGGACT GACAGGTGCC 600
CGCCTTCGCC TGCGCCTTGA TGGGAGCGAC AACAAAAATG ACTTCTGGCG GCTGGTTGAC 660
TCAGCTGAAA TCCAGCCTAT TGGGAACTGT GAAAAGAATG GGGGTATGCT ACAGCCACCT 720
CTTGGATTTC GGCTGAATGC GTCTTCTTGG CCCATGTTCC TTTTGAAGAC GCTAAATGGA 780
GCAGAGATGG CTCCCATCAG GATTTTCCAC AAGGAGCCAC CATCGCCTTC CCACAACTTC 840
TTCAAAATGG GAATGAAGCT AGAAGCTGTG GACAGGAAGA ACCCTCATTT CATTTGCCCA 900
GCCACTATTG GGGAGGTTCG GGGCTCAGAG GTGCTTGTCA CTTTTGATGG GTGGCGAGGG 960
GCCTTTGACT ACTGGTGCCG CTTCGACTCC CGAGACATCT TCCCTGTGGG CTGGTGTTCC 1020
TTGACTGGAG ACAACCTGCA GCCTCCTGGC ACCAAAGTTG TGATTCCAAA GAATCCCTAT 1080
CCTGCCTCCG ATGTGAATAC TGAGAAGCCC AGCATCCACA GCAGCACCAA AACTGTCTTG 1140
GAACATCAAC CAGGGCAGAG GGGGCGTAAA CCAGGAAAGA AGCGGGGCCG GACACCCAAG 1200
ACCCTAATTT CCCATCCCAT CTCTGCCCCA TCCAAGACAG CTGAACCTTT GAAATTCCCA 1260
AAGAAGAGAG GTCCCAAACC TGGCAGCAAG AGGAAACCTC GGACTTTGCT GAACCCACCA 1320
CCTGCCTCAC CAACAACCAG CACTCCTGAA CCGGATACCA GCACTGTACC CCAGGATGCT 1380
GCCACCATCC CCAGCTCAGC CATGCAGGCC CCAACAGTTT GTATCTACTT GAACAAGAAT 1440
GGCAGCACAG GCCCCCACTT AGATAAGAAG AAGGTCCAGC AACTCCCTGA CCATTTTGGA 1500
CCAGCCCGTG CCTCTGTGGT GTTGCAGCAG GCTGTCCAGG CCTGTATCGA CTGTGCTTAT 1560
CACCAGAAAA CCGTCTTCAG CTTCCTCAAG CAAGGCCATG GTGGTGAGGT TATCTCAGCC 1620
GTGTTTGACC GGGAACAGCA TACCCTCAAC CTCCCAGCAG TCAACAGCAT CACCTACGTC 1680
CTCCGCTTCC TGGAGAAACT CTGCCACAAC CTTCGTAGTG ACAATCTGTT TGGCAACCAG 1740
CCCTTTACAC AGACTCACTT GTCACTCACT GCCATAGAGT ACAGCCACAG CCACGACAGG 1800
TACCTACCAG GTGAAACCTT TGTCCTGGGG AATAGTCTGG CCCGCTCCTT GGAACCACAC 1860
TCAGACTCAA TGGACTCTGC CTCAAATCCC ACCAACCTTG TCAGCACCTC CCAAAGGCAC 1920
CGGCCCTTGC TTTCATCCTG TGGCCTCCCA CCAAGCACTG CCTCAGCTGT GCGCAGGCTA 1980
TGCTCCAGGG GGTCGGACCG ATACCTGGAG AGCCGCGATG CCTCTCGACT GAGTGGCCGG 2040
GACCCCTCCT CCTGGACAGT CGAGGATGTG ATGCAGTTTG TCCGGGAAGC TGATCCTCAG 2100
CTTGGACCCC ACGCTGACCT GTTTCGCAAA CACGAGATCG ATGGCAAGGC CCTGCTGCTG 2160
CTGCGCAGTG ACATGATGAT GAAGTACATG GGCCTGAAGC TGGGGCCTGC ACTCAAGCTC 2220
TCCTACCACA TTGACCGGCT GAAGCAGGGC AAGTTCTGAA CCAGGAGAGG CAGCCTAGAC 2280
AACCAAGTGG CAGCAGGTGG GGGCATTCTT CTAAGAATGA GGGGCATCAG CCCACCCCAG 2340
GCACCTCAGT GGGGTTCCGG GCCACCTCAG GACTCCAAGA GGCTGTGTGG AGCCACCACT 2400
CCTAGCCACA GCTGCCATGA TAAGTCCTTC CATGAAGGAC TGAGGAGGGA GAGTGGGGGT 2460
CCAGGGCTGG TGCTGCTCTT CCCTCAGCTC TGCCGGGGCT CTAAGGTCCC TCTATTTATT 2520
TCTCAACCCT GGCTGGCCTC TCACCAGGAG TTTAGGCTGA ATGCCTTCCA CGTGATGGAG 2580
GAAAAGGCCA ACTCTGTCCT GGTCTTGCTG TGGCACCCCA TCGCCCCACA GCTCGTACCT 2640
TCTCACCAGA TTCCCCTGAA TCCAAACTCG TGCTGCAAAC CTCTACCTTT TTTACAAAAA 2700
GATCTTATTG TTAATTTATT GTTTCTGGCA CTTGGGCAAA CCCTGTAGTT AATACTCCTC 2760
CCACACTAGA CACTGGGTTT CAGGAGGAGG GAGACTGCCC TGCTTTGGTC CCAGAGAGGC 2820
CCTCTGCAGA TAGGCGTGGC CCCTCTTCAG AGGACACTAC CCTAGGGCAC TTTCTCTTTG 2880
AGGTGGAGAG ACCCATAAAG CCTTGACCAC ATCACTCCAT ATGGGGAGGA GAAGGATCCC 2940
TGTCACCTTC TCCTCTCTTC ACGGGGCCCT TTTGCAGCCC TAGGCCTCAT CTGTGGGAAG 3000
GGAGTCCCTG GCTCATACTG CCCCCACCAC AGCTCCTTGC CCTGGCCAGA ACTGCTGTCG 3060
AAGAAAATCA GGCCGGAAGG CCAAGAAGGC GCTAAGGGGG ATGGGAGGGC AGGTTTTCCA 3120
GGCTGGAGTC GGTTCCACCC AGTCGCCTGT CCACAGGCTT CCTTGTAAGC AAGTCAGCAG 3180
CACAGCTACT CACGCTGCCA TCTGGACTTA TTTTATGTCA ATCTGTTTAT AAATAAAAAC 3240
CAATATAGGG AATTC 3255

(2) INFORMATION FOR SEQ ID NO:6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 591 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:



Met Gln Ser Tyr Thr Pro Pro Ser Asn Glu Phe Lys Ile Ser Met Lys
1               5                   10                  15

Leu Glu Ala Gln Asp Pro Arg Asn Thr Thr Ser Thr Cys Ile Ala Thr
            20                  25                  30

Val Val Gly Leu Thr Gly Ala Arg Leu Arg Leu Arg Leu Asp Gly Ser
        35                  40                  45

Asp Asn Lys Asn Asp Phe Trp Arg Leu Val Asp Ser Ala Glu Ile Gln
    50                  55                  60

Pro Ile Gly Asn Cys Glu Lys Asn Gly Gly Met Leu Gln Pro Pro Leu
65                  70                  75                  80

Gly Phe Arg Leu Asn Ala Ser Ser Trp Pro Met Phe Leu Leu Lys Thr
                85                  90                  95

Leu Asn Gly Ala Glu Met Ala Pro Ile Arg Ile Phe His Lys Glu Pro
            100                 105                 110

Pro Ser Pro Ser His Asn Phe Phe Lys Met Gly Met Lys Leu Glu Ala
        115                 120                 125

Val Asp Arg Lys Asn Pro His Phe Ile Cys Pro Ala Thr Ile Gly Glu
    130                 135                 140

Val Arg Gly Ser Glu Val Leu Val Thr Phe Asp Gly Trp Arg Gly Ala
145                 150                 155                 160

Phe Asp Tyr Trp Cys Arg Phe Asp Ser Arg Asp Ile Phe Pro Val Gly
                165                 170                 175

Trp Cys Ser Leu Thr Gly Asp Asn Leu Gln Pro Pro Gly Thr Lys Val
            180                 185                 190

Val Ile Pro Lys Asn Pro Tyr Pro Ala Ser Asp Val Asn Thr Glu Lys
        195                 200                 205

Pro Ser Ile His Ser Ser Thr Lys Thr Val Leu Glu His Gln Pro Gly
    210                 215                 220

Gln Arg Gly Arg Lys Pro Gly Lys Lys Arg Gly Arg Thr Pro Lys Thr
225                 230                 235                 240

Leu Ile Ser His Pro Ile Ser Ala Pro Ser Lys Thr Ala Glu Pro Leu
                245                 250                 255

Lys Phe Pro Lys Lys Arg Gly Pro Lys Pro Gly Ser Lys Arg Lys Pro
            260                 265                 270

Arg Thr Leu Leu Asn Pro Pro Pro Ala Ser Pro Thr Thr Ser Thr Pro
        275                 280                 285

Glu Pro Asp Thr Ser Thr Val Pro Gln Asp Ala Ala Thr Ile Pro Ser
    290                 295                 300

Ser Ala Met Gln Ala Pro Thr Val Cys Ile Tyr Leu Asn Lys Asn Gly
305                 310                 315                 320

Ser Thr Gly Pro His Leu Asp Lys Lys Lys Val Gln Gln Leu Pro Asp
                325                 330                 335

His Phe Gly Pro Ala Arg Ala Ser Val Val Leu Gln Gln Ala Val Gln
            340                 345                 350

Ala Cys Ile Asp Cys Ala Tyr His Gln Lys Thr Val Phe Ser Phe Leu
        355                 360                 365

Lys Gln Gly His Gly Gly Glu Val Ile Ser Ala Val Phe Asp Arg Glu
    370                 375                 380

Gln His Thr Leu Asn Leu Pro Ala Val Asn Ser Ile Thr Tyr Val Leu
385                 390                 395                 400

Arg Phe Leu Glu Lys Leu Cys His Asn Leu Arg Ser Asp Asn Leu Phe
                405                 410                 415

Gly Asn Gln Pro Phe Thr Gln Thr His Leu Ser Leu Thr Ala Ile Glu
            420                 425                 430

Tyr Ser His Ser His Asp Arg Tyr Leu Pro Gly Glu Thr Phe Val Leu
        435                 440                 445

Gly Asn Ser Leu Ala Arg Ser Leu Glu Pro His Ser Asp Ser Met Asp
    450                 455                 460

Ser Ala Ser Asn Pro Thr Asn Leu Val Ser Thr Ser Gln Arg His Arg
465                 470                 475                 480

Pro Leu Leu Ser Ser Cys Gly Leu Pro Pro Ser Thr Ala Ser Ala Val
                485                 490                 495

Arg Arg Leu Cys Ser Arg Gly Ser Asp Arg Tyr Leu Glu Ser Arg Asp
            500                 505                 510

Ala Ser Arg Leu Ser Gly Arg Asp Pro Ser Ser Trp Thr Val Glu Asp
        515                 520                 525

Val Met Gln Phe Val Arg Glu Ala Asp Pro Gln Leu Gly Pro His Ala
    530                 535                 540

Asp Leu Phe Arg Lys His Glu Ile Asp Gly Lys Ala Leu Leu Leu Leu
545                 550                 555                 560

Arg Ser Asp Met Met Met Lys Tyr Met Gly Leu Lys Leu Gly Pro Ala 

565 570 575 

Leu Lys Leu Ser Tyr His He Asp Arg Leu Lys &;\n Gly Lys Phe 

580 585 590 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3065 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CTAGAATTCA GCGGCCGCTT AATTCTAGCT GGATGGGAGT GAGCCGCCCG CGCCCCGCGC 60
CGCTGTCGCC CTCAGATGGA GAGATTAAAT CACAGAGAAA CTAACTTGTC AGAGGTCAGA 120
GCAAGGTGTA GGTGGATCCA GGAATAAGTC TCAAGCTTCA TCACTCCTTG CTTAGTTTTA 180
GGCCATTGAC TATGCAGCCT AGTGACTGGA ATGATGTGAA AAAACCTAAG TATGGTCACT 240
TGTCAGAGTC TGCATCTCAA TATCAAGAAT CTGTTGACAT CCTGGAGCTA GCATCTAGTG 300
CTTTTTGCAT GGCCCAAAGG GGCCCTGTGC TGCTCCACTA CAGAGGAAAA TTCAAGAAAT 360
GCTGGTTTGC TACAGTGTTT TAGCTTGTGA GAGTCTCTGG GACCTTCCCT GCTCCATCAT 420
GGGGTCACCT CTAGGTCATT TTACCTGGGA CAAATACCTA AAAGAAACAT GTTCAGTCCC 480
AGCGCCTGTC CATTGCTTCA AGCAGTCCTA CACACCTCCA AGTAATGAGT TCAAGATCAG 540
CATGAAATTG GAAGCACAGG ATCCCAGGAA CACCACATCC ACCTGTATTG CCACGGTCGT 600
TGGATTGACA GGTGCCCGAC TTCGTCTGCG CCTTGATGGC AGTGACAACA AGAATGACTT 660
CTGGAGACTG GTTGACTCCT CTGAAATCCA GCCAATTGGA AACTGTGAGA AGAATGGCGG 720
GATGCTGCAG CCCCCTCTAG GATTTCGGCT GAATGCCTCC TCTTGGCCCA TGTTCCTTTT 780
GAAGACACTA AATGGAGGAG AGATGGCTCC CATCAAGATT TTCCATAAGG AGCCACCATC 840
ACCTTCCCAC AACTTCTTCA AAATGGGAAT GAAGTTAGAA GCTGTAGACA GAAAGAACCC 900
TCATTTCATT TGCCCAGCCA CTATTGGAGA AGTTCGAGGC GCAGAAGTGC TAGTCACCTT 960
TGATGGGTGG CGAGGCGCAT TTGACTACTG GTGCCGCTTT GACTCCCGGG ACATCTTTCC 1020
TGTGGGCTGG TGTTCTTTGA CTGGAGATAA CCTGCAGCCA GCTGGCACCA AAGTTGTGAT 1080
TCCAAAGAAT CCGTCCCCTT CATCTGATGT GAGCACTGAG AAGCCCAGCA TCCACAGCAC 1140
CAAAACTGTC TTGGAGCATC AGCCAGGGCA GAGGGGCCGC AAACCAGGAA AGAAGCGGGG 1200
CCGAACACCC AAGATCCTTA TTCCCCATCC CACCTCTACC CCATCCAAGT CAGCTGAACC 1260
TTTGAAATTT CCAAAGAAGA GAGGTCCCAA GCCTGGCAGT AAGAGGAAAC CTCGGACTTT 1320
GCTGAGCCCA CCAGCCACCT CACCAACAAC CAGCACCCCT GAACCGGACA CCAGCACTGT 1380
TCCTCAAGAT GCTGCCACCG TCCCAAGTTC AGCCATGCAG GCCCCCACAG TTTGTATCTA 1440
CTTGAACAAG AGCGGCAGCA CGGGCCCCCA CCTGGATAAG AAGAAGATCC AACAACTCCC 1500
TGACCATTTT GGGCCAGCCC GTGCCTCTGT GGTGCTGCAG CAGGCTGTCC AGGCTTGCAT 1560
TGACTGTGCT TATCACCAGA tt&C^EttCTT .(OVGCTTCCTC AAACAGGGCC ACGGCGGTGA 1620
AGTCATTTCA GCCGTGTTTG ACCGGGAACA GCACACTCTG AACCTCCCAG CAGTCAACAG 1680
CATCACCTAT GTCCTCCGTT TCCTGGAGAA GCTCTGCCAC AAGCTTCGAA GTGACAATCT 1740
GTTTGGCAAC CAGCCCTTTA CACAGACTCA CTTATCACTC ACTGCCACAG AGTATAATCA 1800
CAACCACGAC AGGTACCTAC CAGGTGAAAC CTTTGTCCTG GGGAATAGCC TGGCCCGGTC 1860
CTTGGAGACA CACTCAGACC TGATGGATTC TGCCTTGAAG CCTGCCAACC TTGTCAGCAC 1920
ATCCCAAAAC CTTCGGACTC CTGGCTATCG GCCCTTGCTT CCCTCCTGTG GCCTCCCATT 1980
AAGCACTGTC TCTGCTGTGC GTAGGCTCTG CTCTAAGGGA GTGTTAAAAG GAAAAAAGGA 2040
AAGAAGGGAT GTGGAGTGAT TTTGGAAACT AAATCATTCC CCAGGGTCAG ATCGACATCT 2100
GGAGAGCCGA GATCCCCCTC GCCTGAGTGG CCGGGACCCC TCCTCATGGA CAGTGGAGGA 2160
TGTGATGCAG TTTGTCCGGG AAGCCGATCC TCAGCTTGGA TCCCATGCTG ACCTCTTCCG 2220
AAAACATGRA ATCGATGGCA AGGCCCTGCT CCTGCTGCGC AGTGACATGA TGATGAAGTA 2280
CATGGGCCTG AAGCTGGGGC CCGCCCTCAA GCTCTCCTTT CACATTGACC GGCTGAAGCA 2340
GGGCAAGTTC TGAACAGGAG GCACTCTTCT CCCAGGAAGC CGCCCGCCAG CTCCCAGGCA 2400
CCTTAGTAGG GCTCTGGCTG ACCTCAGGAC TCTAGGAGGC TGGAAAGCCA CCACTGCTAC 2460
CCTTCCTGCC CTGATGTGTC CTTCCATGAA GGACTGAGGA GGGAACAGTG GGCCCGGGGC 2520
TGGTGCTGCT CTTCCCCTTA GCCTGCTGTG GCTCCCAGGC CCTTCTATTT ATTTCTCAAG 2580
GCTAGCCAGC CTCTCTCCAC AAGTTTAGAC GAGCACCTTT CAAGAGATGA GGAAGACGCC 2640
AGCCCTAGGA CCTTGAAAGG CCCTGGTACC CAGGCCCCTT GCCACCTCCT GGGCTTGGCA 2700
TAGTGTCCCA AGGCCCCCAG CTCATGCCTT CTCApTGGAT CCCCAGACTC TGAACTTATG 2760
GTGCAGACCT TTTTTAAAGA GATCCTTTCT TATTGCTAAT TTATTGCTTC TGGCGTTTGG 2820
ACTTAATGCT TCTCTTGCAC CAAACAGTTT TTTGGAAGAG GGAGACCATC CTCTGGTCCA 2880
GAGAGGGCCT CTCCAGAGAA GTGTGGCCTA TTTCAGAAGA CACTGCCCTA GGGCACTTCT 2940
TCTCTGGAAT GGACAAAGTA TTTGGCTCAC TGAGCAAAAG GTGAGQGTCT CTCTTCCTAC 3000
ACTGGGTCCT TTGTAGCCCC AGTCTTCATC TCTGATGGAG TTTCCCCTCA CCCTGCCCTC 3060
GTGCC 3065

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 664 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Leu Val Cys Tyr Ser Val Leu Ala Cys Glu Ser Leu Trp Asp Leu
1 5 10 15

Pro Cys Ser Ile Met Gly Ser Pro Leu Gly His Phe Thr Trp Asp Lys
20 25 30

Tyr Leu Lys Glu Thr Cys Ser Val Pro Ala Pro Val His Cys Phe Lys
35 40 45

Gln Ser Tyr Thr Pro Pro Ser Asn Glu Phe Lys Ile Ser Met Lys Leu
50 55 60

Glu Ala Gln Asp Pro Arg Asn Thr Thr Ser Thr Cys Ile Ala Thr Val
65 70 75 80

Val Gly Leu Thr Gly Ala Arg Leu Arg Leu Arg Leu Asp Gly Ser Asp
85 90 95

Asn Lys Asn Asp Phe Trp Arg Leu Val Asp Ser Ser Glu Ile Gln Pro
100 105 110

Ile Gly Asn Cys Glu Lys Asn Gly Gly Met Leu Gln Pro Pro Leu Gly
115 120 125

Phe Arg Leu Asn Ala Ser Ser Trp Pro Met Phe Leu Leu Lys Thr Leu
130 135 140

Asn Gly Ala Glu Met Ala Pro Ile Lys Ile Phe His Lys Glu Pro Pro
145 150 155 160
Ser Pro Ser His Asn Phe Phe Lys Met Gly Met Lys Leu Glu Ala Val
165 170 175

Asp Arg Lys Asn Pro His Phe Ile Cys Pro Ala Thr Ile Gly Glu Val
180 185 190

Arg Gly Ala Glu Val Leu Val Thr Phe Asp Gly Trp Arg Gly Ala Phe
195 200 205

Asp Tyr Trp Cys Arg Phe Asp Ser Arg Asp Ile Phe Pro Val Gly Trp
210 215 220

Cys Ser Leu Thr Gly Asp Asn Leu Gln Pro Pro Gly Thr Lys Val Val
225 230 235 240

Ile Pro Lys Asn Pro Ser Pro Ser Ser Asp Val Ser Thr Glu Lys Pro
245 250 255

Ser Ile His Ser Thr Lys Thr Val Leu Glu His Gln Pro Gly Gln Arg
260 265 270

Gly Arg Lys Pro Gly Lys Lys Arg Gly Arg Thr Pro Lys Ile Leu Ile
275 280 285

Pro His Pro Thr Ser Thr Pro Ser Lys Ser Ala Glu Pro Leu Lys Phe
290 295 300

Pro Lys Lys Arg Gly Pro Lys Pro Gly Ser Lys Arg Lys Pro Arg Thr
305 310 315 320

Leu Leu Ser Pro Pro Pro Thr Ser Pro Thr Thr Ser Thr Pro Glu Pro
325 330 335

Asp Thr Ser Thr Val Pro Gln Asp Ala Ala Thr Val Pro Ser Ser Ala
340 345 350

Met Gln Ala Pro Thr Val Cys Ile Tyr Leu Asn Lys Ser Gly Ser Thr
355 360 365

Gly Pro His Leu Asp Lys Lys Lys Ile Gln Gln Leu Pro Asp His Phe
370 375 380

Gly Pro Ala Arg Ala Ser Val Val Leu Gln Gln Ala Val Gln Ala Cys
385 390 395 400

Ile Asp Cys Ala Tyr His Gln Lys Thr Val Phe Ser Phe Leu Lys Gln
405 410 415

Gly His Gly Gly Glu Val Ile Ser Ala Val Phe Asp Arg Glu Gln His
420 425 430

Thr Leu Asn Leu Pro Ala Val Asn Ser Ile Thr Tyr Val Leu Arg Phe
435 440 445

Leu Glu Lys Leu Cys His Asn Leu Arg Ser Asp Asn Leu Phe Gly Asn
450 455 460

Gln Pro Phe Thr Gln Thr His Leu Ser Leu Thr Ala Thr Glu Tyr Asn
465 470 475 480

His Asn His Asp Arg Tyr Leu Pro Gly Glu Thr Phe Val Leu Gly Asn
485 490 495

Ser Leu Ala Arg Ser Leu Glu Thr His Ser Asp Leu Met Asp Ser Ala
500 505 510

Leu Lys Pro Ala Asn Leu Val Ser Thr Ser Gln Asn Leu Arg Thr Pro
515 520 525

Gly Tyr Arg Pro Leu Leu Pro Ser Cys Gly Leu Pro Leu Ser Thr Val
530 535 540

Ser Ala Val Arg Arg Leu Cys Ser Lys Gly Val Leu Lys Gly Lys Lys
545 550 555 560

Glu Arg Arg Asp Val Glu Ser Phe Trp Lys Leu Asn His Ser Pro Gly
565 570 575

Ser Asp Arg His Leu Glu Ser Arg Asp Pro Pro Arg Leu Ser Gly Arg
580 585 590

Asp Pro Ser Ser Trp Thr Val Glu Asp Val Met Gln Phe Val Arg Glu
595 600 605

Ala Asp Pro Gln Leu Gly Ser His Ala Asp Leu Phe Arg Lys His Glu
610 615 620

Ile Asp Gly Lys Ala Leu Leu Leu Leu Arg Ser Asp Met Met Met Lys
625 630 635 640

Tyr Met Gly Leu Lys Leu Gly Pro Ala Leu Lys Leu Ser Phe His Ile
645 650 655

Asp Arg Leu Lys Gln Gly Lys Phe
660






WO 97/42211 



PCT/US97/07575 



WHAT IS CLAIMED IS:



1. An isolated mammalian Scm polypeptide, comprising a sequence of at least 54
consecutive amino acids of a sequence selected from the group consisting of SEQ ID
NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6.

2. The polypeptide of claim 1 which comprises at least 60 consecutive amino
acids from the selected sequence.

3. The polypeptide of claim 1 which comprises at least 65 consecutive amino
acids from the selected sequence.

4. The polypeptide of claim 1 which comprises at least 75 consecutive amino
acids from the selected sequence.

5. The polypeptide of claim 1 which comprises all of the selected sequence.

6. An isolated mammalian Scm polypeptide comprising a sequence which is at
least 95% identical to a sequence selected from the group consisting of SEQ ID NO:
2, SEQ ID NO: 4, and SEQ ID NO: 6.

7. An isolated nucleic acid molecule that encodes a polypeptide of claim 1.

8. An isolated nucleic acid molecule comprising at least 30 contiguous nucleotides
selected from the group of sequences consisting of SEQ ID NO: 1, SEQ ID NO: 3,
and SEQ ID NO: 5.

9. The nucleic acid molecule of claim 8 which comprises all of the selected
sequence.

10. An isolated nucleic acid molecule which encodes a polypeptide of claim 6.

11. An isolated nucleic acid molecule comprising a sequence which is at least 95%
identical to a sequence selected from the group of sequences consisting of SEQ ID
NO: 1, SEQ ID NO: 3, and SEQ ID NO: 5.

12. An antibody preparation that specifically binds to a polypeptide of claim 6, and
does not bind specifically to other human proteins.

13. A method of treating a neoplasm comprising:

contacting a neoplasm with an effective amount of a therapeutic agent
comprising a mammalian Scm polypeptide which comprises a sequence selected from
the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6, whereby
growth of the neoplasm is arrested.

14. A method of inducing cell differentiation comprising:

contacting a progenitor cell with a mammalian Scm polypeptide which
comprises a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID
NO: 4, and SEQ ID NO: 6, whereby differentiation of the cell is induced.

15. A method of regulating cell growth comprising:

contacting a cell whose growth is uncontrolled with a mammalian Scm
polypeptide which comprises a sequence selected from the group consisting of SEQ ID
NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6, whereby growth of the cell is regulated.

16. A pharmaceutical composition comprising an effective amount of a therapeutic
agent comprising a mammalian Scm polypeptide which comprises a sequence selected
from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6, and
a pharmaceutically acceptable carrier.

17. A method of diagnosis of neoplasia comprising:

contacting a tissue sample suspected of neoplasia isolated from a patient with
a mammalian Scm gene probe comprising at least 1^ contiguous nucleotides of a
sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, and
SEQ ID NO: 5, wherein a tissue which underexpresses mammalian Scm or expresses a
variant mammalian Scm is categorized as neoplastic.

18. The method of claim 17 wherein underexpression is determined by comparison
to a normal tissue of the patient.

19. The method of claim 17 wherein a variant mammalian Scm is determined by
comparison to a normal tissue of the patient.

20. The method of claim 17, wherein said neoplasm is selected from the group
consisting of colorectal adenocarcinoma, lung carcinoma, melanoma, lymphoma, and
leukemia.

21. A method of diagnosing neoplasia comprising:

contacting PCR primers which specifically hybridize with a mammalian Scm
gene sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO:
3, and SEQ ID NO: 5, with nucleic acids isolated from a tissue suspected of
neoplasia;
amplifying mammalian Scm sequences in the nucleic acids of the tissue; and
detecting a mutation in the amplified sequence, wherein a mutation is
identified when the amplified sequence differs from a sequence similarly amplified
from a normal human tissue.

22. A method of diagnosing neoplasia comprising:

contacting a bDNA probe with nucleic acids isolated from a tissue suspected of
neoplasia, wherein the bDNA probe specifically hybridizes with a mammalian Scm
gene sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO:
3, and SEQ ID NO: 5;
detecting hybrids formed between the bDNA probe and nucleic acids isolated
from the tissue; and
identifying a mutation in the nucleic acids isolated from the tissue by
comparing the hybrids formed with hybrids similarly formed using nucleic acids from
a normal human tissue.

23. A method of diagnosing neoplasia comprising:

contacting a tissue sample suspected of being neoplastic with an antibody
selected from the group consisting of: one which specifically binds to wild-type
mammalian Scm as shown in SEQ ID NO: 2, 4, or 6, or one which specifically binds
to an expressed mammalian Scm variant;
detecting binding of the antibody to components of the tissue sample, wherein
a difference in the binding of the antibody to components of the tissue sample, as
compared to binding of the antibody to a normal human tissue sample indicates
neoplasia of the tissue.

24. A method of diagnosing neoplasia comprising:

contacting RNA from a tissue suspected of being neoplastic with PCR primers
which specifically hybridize to a mammalian Scm gene sequence as shown in SEQ
ID NO: 1, 3, or 5, or a bDNA probe which specifically hybridizes to said sequence;

determining quantitative levels of mammalian Scm RNA in the tissue by PCR
amplification or bDNA probe detection, wherein lower levels of mammalian Scm
RNA as compared to a normal human tissue indicate neoplasia.

25. An isolated nucleic acid molecule which comprises a sequence of at least 20
contiguous nucleotides of a 5' untranslated region of a mammalian Scm gene, for use
in regulating a heterologous coding sequence coordinately with mammalian Scm.

26. An isolated nucleic acid molecule which comprises a sequence of at least 20
contiguous nucleotides of a 3' untranslated region of a mammalian Scm gene, for use
in regulating a heterologous coding sequence coordinately with mammalian Scm.

27. An isolated nucleic acid molecule which comprises at least 20 contiguous
nucleotides of a promoter region of a mammalian Scm gene, for use in regulating a
heterologous coding sequence coordinately with mammalian Scm.

28. An isolated nucleic acid molecule which comprises at least ^ contiguous
nucleotides of an intron of a mammalian Scm gene, for use in regulating a
heterologous coding sequence coordinately with mammalian Scm.

29. A method of identifying modulators of mammalian Scm function comprising:

contacting a test substance with a mammalian Scm gene or a reporter construct
comprising a mammalian Scm promoter and a reporter gene;

quantitating transcription in the presence and absence of the test substance,
wherein a test substance which increases transcription is a candidate drug for
antineoplastic therapy.

30. The method of claim 29 wherein transcription is quantitated indirectly by
measuring the gene product or a reaction product thereof.

31. A vector comprising the nucleic acid molecule of claim 7.

32. A vector comprising the nucleic acid molecule of claim 8.

33. A vector comprising the nucleic acid molecule of claim 9.

34. A vector comprising the nucleic acid molecule of claim 10.

35. A vector comprising the nucleic acid molecule of claim 11.

3. As only some of the required additional search fees were timely paid by the applicant, this international search report covers
only those claims for which fees were paid, specifically claims Nos.:

4. ☒ No required additional search fees were timely paid by the applicant. Consequently, this international search report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:
1-11, 13, 16 and 31-35

Remark on Protest

□ The additional search fees were accompanied by the applicant's protest.
□ No protest accompanied the payment of additional search fees.

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING

This ISA found multiple inventions as follows:

This application contains the following inventions or groups of inventions which are not so linked as to form a single
inventive concept under PCT Rule 13.1.

Group I, claim(s) 1-6, 13 and 16, drawn to Scm polypeptide, method of use for treating neoplasia and pharmaceutical
compound containing the Scm polypeptide and claim(s) 7-11 and 31-35, drawn to nucleic acid encoding the Scm
polypeptide and vectors containing the nucleic acid.
Group II, claim(s) 12, drawn to an antibody specific for the Scm polypeptide.
Group III, claim(s) 14, drawn to a method of inducing cell differentiation.
Group IV, claim(s) 15, drawn to a method of regulating cell growth.
Group V, claim(s) 17-20, drawn to a method of diagnosing neoplasia with DNA hybridization.
Group VI, claim(s) 21, drawn to a method of diagnosing neoplasia using PCR.
Group VII, claim(s) 22, drawn to a method of diagnosing neoplasia using bDNA.
Group VIII, claim(s) 23, drawn to a method of diagnosing using an antibody.
Group IX, claim(s) 24, drawn to a method of diagnosing using RNA.
Group X, claim(s) 25, drawn to a nucleic acid molecule containing the 5 prime untranslated region of the Scm gene.
Group XI, claim(s) 26, drawn to a nucleic acid molecule containing the 3 prime untranslated region of the Scm gene.
Group XII, claim(s) 27, drawn to a nucleic acid molecule containing the promoter region of the Scm gene.
Group XIII, claim(s) 28, drawn to a nucleic acid molecule containing the intron region of the Scm gene.
Group XIV, claim(s) 29-30, drawn to a method of identifying modulators of the Scm function.

The inventions listed as Groups I-XIV do not relate to a single inventive concept under PCT Rule 13.1 because, under
PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: The product
Groups I-II and X-XIII differ from the method Groups III-IX and XIV in that they each recite a special technical feature
of a composition or product that is not found in the method Groups. For the product claims in Groups I-II, X-XIII each
product has a special technical feature of being an Scm protein or Scm DNA, Scm antibody, Scm 5 prime region DNA,
Scm 3 prime region DNA, Scm promoter region DNA and Scm intron region DNA, respectively, that is not found as
a special technical feature in the other groups, respectively. The Scm cDNA of Group I has the special technical
feature of encoding Scm protein that is not found in untranslated regions of the Scm DNA (Groups X-XIII). The
various untranslated regions of Groups X-XIII have the special technical feature of being involved in regulation of
mRNA start position, mRNA stability, regulation of gene expression and tissue specific regulation, respectively, that is
not found in the other Groups. The Scm antibody has the special technical feature of binding to the Scm protein which
is not found in the other Groups.

For the method groups III-IX and XIV, each method has a special technical feature of inducing cell
differentiation, regulating cell growth, diagnosis by hybridization, diagnosis by PCR, diagnosis by bDNA, diagnosis by
antibody binding, diagnosis by RNA and identification of Scm modulators that is not found in the other groups,
respectively. Moreover, the method groups III-IX and XIV differ from the method of Group I in that the method of
Group I recites the special technical feature of treating neoplasia that is not found in any of the other Groups